dawarich/spec/services/points/raw_data/archiver_spec.rb
Evgenii Burmakin c8242ce902
0.36.3 (#2013)
* fix: move foreman to global gems to fix startup crash (#1971)

* Update exporting code to stream points data to file in batches to red… (#1980)

* Update exporting code to stream points data to file in batches to reduce memory usage

* Update changelog

* Update changelog

* Feature/maplibre frontend (#1953)

* Add a plan to use MapLibre GL JS for the frontend map rendering, replacing Leaflet

* Implement phase 1

* Phases 1-3 + part of 4

* Fix e2e tests

* Phase 6

* Implement fog of war

* Phase 7

* Next step: fix specs, phase 7 done

* Use our own map tiles

* Extract v2 map logic to separate manager classes

* Update settings panel on v2 map

* Update v2 e2e tests structure

* Reimplement location search in maps v2

* Update speed routes

* Implement visits and places creation in v2

* Fix last failing test

* Implement visits merging

* Fix a routes e2e test and simplify the routes layer styling.

* Extract js to modules from maps_v2_controller.js

* Implement area creation

* Fix spec problem

* Fix some e2e tests

* Implement live mode in v2 map

* Update icons and panel

* Extract some styles

* Remove unused file

* Start adding dark theme to popups on MapLibre maps

* Make popups respect dark theme

* Move v2 maps to maplibre namespace

* Update v2 references to maplibre

* Put place, area and visit info into side panel

* Update API to use safe settings config method

* Fix specs

* Fix method name to config in SafeSettings and update usages accordingly

* Add missing public files

* Add handling for real time points

* Fix remembering enabled/disabled layers of the v2 map

* Fix lots of e2e tests

* Add settings to select map version

* Use maps/v2 as main path for MapLibre maps

* Update routing

* Update live mode

* Update maplibre controller

* Update changelog

* Remove some console.log statements

* Pull only necessary data for map v2 points

* Feature/raw data archive (#2009)

* 0.36.2 (#2007)

* fix: move foreman to global gems to fix startup crash (#1971)

* Update exporting code to stream points data to file in batches to red… (#1980)

* Update exporting code to stream points data to file in batches to reduce memory usage

* Update changelog

* Update changelog

* Feature/maplibre frontend (#1953)

* Add a plan to use MapLibre GL JS for the frontend map rendering, replacing Leaflet

* Implement phase 1

* Phases 1-3 + part of 4

* Fix e2e tests

* Phase 6

* Implement fog of war

* Phase 7

* Next step: fix specs, phase 7 done

* Use our own map tiles

* Extract v2 map logic to separate manager classes

* Update settings panel on v2 map

* Update v2 e2e tests structure

* Reimplement location search in maps v2

* Update speed routes

* Implement visits and places creation in v2

* Fix last failing test

* Implement visits merging

* Fix a routes e2e test and simplify the routes layer styling.

* Extract js to modules from maps_v2_controller.js

* Implement area creation

* Fix spec problem

* Fix some e2e tests

* Implement live mode in v2 map

* Update icons and panel

* Extract some styles

* Remove unused file

* Start adding dark theme to popups on MapLibre maps

* Make popups respect dark theme

* Move v2 maps to maplibre namespace

* Update v2 references to maplibre

* Put place, area and visit info into side panel

* Update API to use safe settings config method

* Fix specs

* Fix method name to config in SafeSettings and update usages accordingly

* Add missing public files

* Add handling for real time points

* Fix remembering enabled/disabled layers of the v2 map

* Fix lots of e2e tests

* Add settings to select map version

* Use maps/v2 as main path for MapLibre maps

* Update routing

* Update live mode

* Update maplibre controller

* Update changelog

* Remove some console.log statements

---------

Co-authored-by: Robin Tuszik <mail@robin.gg>

* Remove esbuild scripts from package.json

* Remove sideEffects field from package.json

* Raw data archivation

* Add tests

* Fix tests

* Fix tests

* Update ExceptionReporter

* Add schedule to run raw data archival job monthly

* Change file structure for raw data archival feature

* Update changelog and version for raw data archival feature

---------

Co-authored-by: Robin Tuszik <mail@robin.gg>

* Set raw_data to an empty hash instead of nil when archiving

* Fix storage configuration and file extraction

* Consider MIN_MINUTES_SPENT_IN_CITY during stats calculation (#2018)

* Consider MIN_MINUTES_SPENT_IN_CITY during stats calculation

* Remove raw data from visited cities api endpoint

* Use user timezone to show dates on maps (#2020)

* Fix/pre epoch time (#2019)

* Use user timezone to show dates on maps

* Limit timestamps to valid range to prevent database errors when users enter pre-epoch dates.

* Limit timestamps to valid range to prevent database errors when users enter pre-epoch dates.

* Fix tests failing due to new index on stats table

* Fix failing specs

* Update redis client configuration to support unix socket connection

* Update changelog

* Fix kml kmz import issues (#2023)

* Fix kml kmz import issues

* Refactor KML importer to improve readability and maintainability

* Implement moving points in map v2 and fix route rendering logic to ma… (#2027)

* Implement moving points in map v2 and fix route rendering logic to match map v1.

* Fix route spec

* fix(maplibre): update date format to ISO 8601 (#2029)

* Add verification step to raw data archival process (#2028)

* Add verification step to raw data archival process

* Add actual verification of raw data archives after creation, and only clear raw_data for verified archives.

* Fix failing specs

* Eliminate zip-bomb risk

* Fix potential memory leak in js

* Return .keep files

* Use Toast instead of alert for notifications

* Add help section to navbar dropdown

* Update changelog

* Remove raw_data_archival_job

* Ensure file is being closed properly after reading in Archivable concern

---------

Co-authored-by: Robin Tuszik <mail@robin.gg>
2025-12-14 12:05:59 +01:00

202 lines
6.8 KiB
Ruby

# frozen_string_literal: true
require 'rails_helper'
RSpec.describe Points::RawData::Archiver do
let(:user) { create(:user) }
let(:archiver) { described_class.new }
before do
allow(PointsChannel).to receive(:broadcast_to)
end
describe '#call' do
context 'when archival is disabled' do
before do
allow(ENV).to receive(:[]).and_call_original
allow(ENV).to receive(:[]).with('ARCHIVE_RAW_DATA').and_return('false')
end
it 'returns early without processing' do
result = archiver.call
expect(result).to eq({ processed: 0, archived: 0, failed: 0 })
end
end
context 'when archival is enabled' do
before do
allow(ENV).to receive(:[]).and_call_original
allow(ENV).to receive(:[]).with('ARCHIVE_RAW_DATA').and_return('true')
end
let!(:old_points) do
# Create points 3 months ago (definitely older than 2 month lag)
old_date = 3.months.ago.beginning_of_month
create_list(:point, 5, user: user,
timestamp: old_date.to_i,
raw_data: { lon: 13.4, lat: 52.5 })
end
it 'archives old points' do
expect { archiver.call }.to change(Points::RawDataArchive, :count).by(1)
end
it 'marks points as archived' do
archiver.call
expect(Point.where(raw_data_archived: true).count).to eq(5)
end
it 'keeps raw_data intact (does not clear yet)' do
archiver.call
Point.where(user: user).find_each do |point|
expect(point.raw_data).to eq({ 'lon' => 13.4, 'lat' => 52.5 })
end
end
it 'returns correct stats' do
result = archiver.call
expect(result[:processed]).to eq(1)
expect(result[:archived]).to eq(5)
expect(result[:failed]).to eq(0)
end
end
context 'with points from multiple months' do
before do
allow(ENV).to receive(:[]).and_call_original
allow(ENV).to receive(:[]).with('ARCHIVE_RAW_DATA').and_return('true')
end
let!(:june_points) do
june_date = 4.months.ago.beginning_of_month
create_list(:point, 3, user: user,
timestamp: june_date.to_i,
raw_data: { lon: 13.4, lat: 52.5 })
end
let!(:july_points) do
july_date = 3.months.ago.beginning_of_month
create_list(:point, 2, user: user,
timestamp: july_date.to_i,
raw_data: { lon: 14.0, lat: 53.0 })
end
it 'creates separate archives for each month' do
expect { archiver.call }.to change(Points::RawDataArchive, :count).by(2)
end
it 'archives all points' do
archiver.call
expect(Point.where(raw_data_archived: true).count).to eq(5)
end
end
end
describe '#archive_specific_month' do
before do
allow(ENV).to receive(:[]).and_call_original
allow(ENV).to receive(:[]).with('ARCHIVE_RAW_DATA').and_return('true')
end
let(:test_date) { 3.months.ago.beginning_of_month.utc }
let!(:june_points) do
create_list(:point, 3, user: user,
timestamp: test_date.to_i,
raw_data: { lon: 13.4, lat: 52.5 })
end
it 'archives specific month' do
expect do
archiver.archive_specific_month(user.id, test_date.year, test_date.month)
end.to change(Points::RawDataArchive, :count).by(1)
end
it 'creates archive with correct metadata' do
archiver.archive_specific_month(user.id, test_date.year, test_date.month)
archive = user.raw_data_archives.last
expect(archive.user_id).to eq(user.id)
expect(archive.year).to eq(test_date.year)
expect(archive.month).to eq(test_date.month)
expect(archive.point_count).to eq(3)
expect(archive.chunk_number).to eq(1)
end
it 'attaches compressed file' do
archiver.archive_specific_month(user.id, test_date.year, test_date.month)
archive = user.raw_data_archives.last
expect(archive.file).to be_attached
expect(archive.file.key).to match(%r{raw_data_archives/\d+/\d{4}/\d{2}/001\.jsonl\.gz})
end
end
describe 'append-only architecture' do
before do
allow(ENV).to receive(:[]).and_call_original
allow(ENV).to receive(:[]).with('ARCHIVE_RAW_DATA').and_return('true')
end
# Use UTC from the start to avoid timezone issues
let(:test_date_utc) { 3.months.ago.utc.beginning_of_month }
let!(:june_points_batch1) do
create_list(:point, 2, user: user,
timestamp: test_date_utc.to_i,
raw_data: { lon: 13.4, lat: 52.5 })
end
it 'creates additional chunks for same month' do
# First archival
archiver.archive_specific_month(user.id, test_date_utc.year, test_date_utc.month)
expect(Points::RawDataArchive.for_month(user.id, test_date_utc.year, test_date_utc.month).count).to eq(1)
expect(Points::RawDataArchive.last.chunk_number).to eq(1)
# Verify first batch is archived
june_points_batch1.each(&:reload)
expect(june_points_batch1.all?(&:raw_data_archived)).to be true
# Add more points for same month (retrospective import)
# Use unique timestamps to avoid uniqueness validation errors
mid_month = test_date_utc + 15.days
june_points_batch2 = [
create(:point, user: user, timestamp: mid_month.to_i, raw_data: { lon: 14.0, lat: 53.0 }),
create(:point, user: user, timestamp: (mid_month + 1.hour).to_i, raw_data: { lon: 14.0, lat: 53.0 })
]
# Verify second batch exists and is not archived
expect(june_points_batch2.all? { |p| !p.raw_data_archived }).to be true
# Second archival should create chunk 2
archiver.archive_specific_month(user.id, test_date_utc.year, test_date_utc.month)
expect(Points::RawDataArchive.for_month(user.id, test_date_utc.year, test_date_utc.month).count).to eq(2)
expect(Points::RawDataArchive.last.chunk_number).to eq(2)
end
end
describe 'advisory locking' do
before do
allow(ENV).to receive(:[]).and_call_original
allow(ENV).to receive(:[]).with('ARCHIVE_RAW_DATA').and_return('true')
end
let!(:june_points) do
old_date = 3.months.ago.beginning_of_month
create_list(:point, 2, user: user,
timestamp: old_date.to_i,
raw_data: { lon: 13.4, lat: 52.5 })
end
it 'prevents duplicate processing with advisory locks' do
# Simulate lock couldn't be acquired (returns nil/false)
allow(ActiveRecord::Base).to receive(:with_advisory_lock).and_return(false)
result = archiver.call
expect(result[:processed]).to eq(0)
expect(result[:failed]).to eq(0)
end
end
end