dawarich/lib/tasks/points_raw_data.rake
Evgenii Burmakin 8d2ade1bdc
0.37.0 (#2067)
2025-12-30 17:33:56 +01:00

295 lines
13 KiB
Ruby

# frozen_string_literal: true

namespace :points do
  namespace :raw_data do
    desc 'Restore raw_data from archive to database for a specific month'
    task :restore, %i[user_id year month] => :environment do |_t, args|
      validate_args!(args)

      user_id = args[:user_id].to_i
      year = args[:year].to_i
      month = args[:month].to_i

      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ' Restoring raw_data to DATABASE'
      puts " User: #{user_id} | Month: #{year}-#{format('%02d', month)}"
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ''

      restorer = Points::RawData::Restorer.new
      restorer.restore_to_database(user_id, year, month)

      puts ''
      puts '✓ Restoration complete!'
      puts ''
      # Zero-pad the month so it matches the header printed above.
      puts "Points in #{year}-#{format('%02d', month)} now have raw_data in database."
      puts 'Run VACUUM ANALYZE points; to update statistics.'
    end

    desc 'Restore raw_data to memory/cache temporarily (for data migrations)'
    task :restore_temporary, %i[user_id year month] => :environment do |_t, args|
      validate_args!(args)

      user_id = args[:user_id].to_i
      year = args[:year].to_i
      month = args[:month].to_i

      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ' Loading raw_data into CACHE (temporary)'
      puts " User: #{user_id} | Month: #{year}-#{format('%02d', month)}"
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ''
      puts 'Data will be available for 1 hour via Point.raw_data_with_archive accessor'
      puts ''

      restorer = Points::RawData::Restorer.new
      restorer.restore_to_memory(user_id, year, month)

      puts ''
      puts '✓ Cache loaded successfully!'
      puts ''
      puts 'You can now run your data migration.'
      puts 'Example:'
      puts " rails runner \"Point.where(user_id: #{user_id}, timestamp_year: #{year}, timestamp_month: #{month}).find_each { |p| p.fix_coordinates_from_raw_data }\""
      puts ''
      puts 'Cache will expire in 1 hour automatically.'
    end

    desc 'Restore all archived raw_data for a user'
    task :restore_all, [:user_id] => :environment do |_t, args|
      raise 'Usage: rake points:raw_data:restore_all[user_id]' unless args[:user_id]

      user_id = args[:user_id].to_i
      user = User.find(user_id)

      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ' Restoring ALL archives for user'
      puts " #{user.email} (ID: #{user_id})"
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ''

      # Load the distinct (year, month) pairs once, so the progress output
      # below does not issue a COUNT query on every iteration.
      archives = Points::RawDataArchive.where(user_id: user_id)
                                       .select(:year, :month)
                                       .distinct
                                       .order(:year, :month)
                                       .to_a

      puts "Found #{archives.size} months to restore"
      puts ''

      archives.each_with_index do |archive, idx|
        puts "[#{idx + 1}/#{archives.size}] Restoring #{archive.year}-#{format('%02d', archive.month)}..."
        restorer = Points::RawData::Restorer.new
        restorer.restore_to_database(user_id, archive.year, archive.month)
      end

      puts ''
      puts "✓ All archives restored for user #{user_id}!"
    end

    desc 'Show archive statistics'
    task status: :environment do
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ' Points raw_data Archive Statistics'
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ''

      total_archives = Points::RawDataArchive.count
      verified_archives = Points::RawDataArchive.where.not(verified_at: nil).count
      unverified_archives = total_archives - verified_archives
      total_points = Point.count
      archived_points = Point.where(raw_data_archived: true).count
      cleared_points = Point.where(raw_data_archived: true, raw_data: {}).count
      archived_not_cleared = archived_points - cleared_points
      percentage = total_points.positive? ? (archived_points.to_f / total_points * 100).round(2) : 0

      puts "Archives: #{total_archives} (#{verified_archives} verified, #{unverified_archives} unverified)"
      puts "Points archived: #{archived_points} / #{total_points} (#{percentage}%)"
      puts "Points cleared: #{cleared_points}"
      puts "Archived but not cleared: #{archived_not_cleared}"
      puts ''

      # Storage size via ActiveStorage
      total_blob_size = ActiveStorage::Blob
                        .joins('INNER JOIN active_storage_attachments ON active_storage_attachments.blob_id = active_storage_blobs.id')
                        .where("active_storage_attachments.record_type = 'Points::RawDataArchive'")
                        .sum(:byte_size)

      puts "Storage used: #{ActiveSupport::NumberHelper.number_to_human_size(total_blob_size)}"
      puts ''

      # Recent activity
      recent = Points::RawDataArchive.where('archived_at > ?', 7.days.ago).count
      puts "Archives created last 7 days: #{recent}"
      puts ''

      # Top users
      puts 'Top 10 users by archive count:'
      puts '─────────────────────────────────────────────────'

      Points::RawDataArchive.group(:user_id)
                            .select('user_id, COUNT(*) as archive_count, SUM(point_count) as total_points')
                            .order('archive_count DESC')
                            .limit(10)
                            .each_with_index do |stat, idx|
        user = User.find(stat.user_id)
        puts "#{idx + 1}. #{user.email.ljust(30)} #{stat.archive_count.to_s.rjust(3)} archives, #{stat.total_points.to_s.rjust(8)} points"
      end

      puts ''
    end

    desc 'Verify archive integrity (all unverified archives, or specific month with args)'
    task :verify, %i[user_id year month] => :environment do |_t, args|
      verifier = Points::RawData::Verifier.new

      if args[:user_id] && args[:year] && args[:month]
        # Verify specific month
        user_id = args[:user_id].to_i
        year = args[:year].to_i
        month = args[:month].to_i

        puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
        puts ' Verifying Archives'
        puts " User: #{user_id} | Month: #{year}-#{format('%02d', month)}"
        puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
        puts ''

        verifier.verify_month(user_id, year, month)
      else
        # Verify all unverified archives
        puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
        puts ' Verifying All Unverified Archives'
        puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
        puts ''

        stats = verifier.call

        puts ''
        puts "Verified: #{stats[:verified]}"
        puts "Failed: #{stats[:failed]}"
      end

      puts ''
      puts '✓ Verification complete!'
    end

    desc 'Clear raw_data for verified archives (all verified, or specific month with args)'
    task :clear_verified, %i[user_id year month] => :environment do |_t, args|
      clearer = Points::RawData::Clearer.new

      if args[:user_id] && args[:year] && args[:month]
        # Clear specific month
        user_id = args[:user_id].to_i
        year = args[:year].to_i
        month = args[:month].to_i

        puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
        puts ' Clearing Verified Archives'
        puts " User: #{user_id} | Month: #{year}-#{format('%02d', month)}"
        puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
        puts ''

        clearer.clear_month(user_id, year, month)
      else
        # Clear all verified archives
        puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
        puts ' Clearing All Verified Archives'
        puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
        puts ''

        stats = clearer.call

        puts ''
        puts "Points cleared: #{stats[:cleared]}"
      end

      puts ''
      puts '✓ Clearing complete!'
      puts ''
      puts 'Run VACUUM ANALYZE points; to reclaim space and update statistics.'
    end

    desc 'Archive raw_data for old data (2+ months old, does NOT clear yet)'
    task archive: :environment do
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ' Archiving Raw Data (2+ months old data)'
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ''
      puts 'This will archive points.raw_data for months 2+ months old.'
      puts 'Raw data will NOT be cleared yet - use verify and clear_verified tasks.'
      puts 'This is safe to run multiple times (idempotent).'
      puts ''

      stats = Points::RawData::Archiver.new.call

      puts ''
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ' Archival Complete'
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ''
      puts "Months processed: #{stats[:processed]}"
      puts "Points archived: #{stats[:archived]}"
      puts "Failures: #{stats[:failed]}"
      puts ''

      # Use `next`, not `return`: `return` inside a rake task block raises
      # LocalJumpError, while `next` simply ends the block early.
      next unless stats[:archived].positive?

      puts 'Next steps:'
      puts '1. Verify archives: rake points:raw_data:verify'
      puts '2. Clear verified data: rake points:raw_data:clear_verified'
      puts '3. Check stats: rake points:raw_data:status'
    end

    desc 'Full workflow: archive + verify + clear (for automated use)'
    task archive_full: :environment do
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ' Full Archive Workflow'
      puts ' (Archive → Verify → Clear)'
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ''

      # Step 1: Archive
      puts '▸ Step 1/3: Archiving...'
      archiver_stats = Points::RawData::Archiver.new.call
      puts " ✓ Archived #{archiver_stats[:archived]} points"
      puts ''

      # Step 2: Verify
      puts '▸ Step 2/3: Verifying...'
      verifier_stats = Points::RawData::Verifier.new.call
      puts " ✓ Verified #{verifier_stats[:verified]} archives"

      if verifier_stats[:failed].positive?
        puts " ✗ Failed to verify #{verifier_stats[:failed]} archives"
        puts ''
        puts '⚠ Some archives failed verification. Data NOT cleared for safety.'
        puts 'Please investigate failed archives before running clear_verified.'
        exit 1
      end

      puts ''

      # Step 3: Clear
      puts '▸ Step 3/3: Clearing verified data...'
      clearer_stats = Points::RawData::Clearer.new.call
      puts " ✓ Cleared #{clearer_stats[:cleared]} points"
      puts ''
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ' ✓ Full Archive Workflow Complete!'
      puts '━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━'
      puts ''
      puts 'Run VACUUM ANALYZE points; to reclaim space.'
    end

    # Alias for backward compatibility
    task initial_archive: :archive
  end
end

def validate_args!(args)
  return if args[:user_id] && args[:year] && args[:month]

  raise 'Usage: rake points:raw_data:TASK[user_id,year,month]'
end
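
A note on early exits in task bodies: Rake stores each task body as a block and runs it later, after the code that defined it has finished. The sketch below is not part of the rake file. It is a minimal, plain-Ruby illustration of how `next` and `return` behave in a block called that way. The helper name `build_block` is hypothetical and exists only for this example.

```ruby
# Build a proc inside a method that has already returned by the time the
# proc is called. This roughly mirrors a task block that is defined at load
# time and invoked later. `build_block` is a hypothetical helper.
def build_block(keyword)
  case keyword
  when :next
    # `next` ends only the block and hands back a value to the caller.
    proc { |n| next :skipped unless n.positive?; :ran }
  when :return
    # `return` tries to return from the defining method, which is gone.
    proc { |n| return :skipped unless n.positive?; :ran }
  end
end

# Early exit with `next` works as intended.
next_result = build_block(:next).call(0)

# The same early exit with `return` raises LocalJumpError instead.
return_result =
  begin
    build_block(:return).call(0)
  rescue LocalJumpError => e
    e.class
  end

# When the guard does not trigger, the block runs to the end.
full_run = build_block(:next).call(1)

puts next_result.inspect
puts return_result.inspect
puts full_run.inspect
```

Under these assumptions, a guard clause written with `next` is the safer way to leave a task body early.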