mirror of
https://github.com/Freika/dawarich.git
synced 2026-01-10 01:01:39 -05:00
199 lines
6.6 KiB
Ruby
# frozen_string_literal: true

module Points
  module RawData
    class Verifier
      def initialize
        @stats = { verified: 0, failed: 0 }
      end

      def call
        Rails.logger.info('Starting raw_data archive verification...')

        unverified_archives.find_each do |archive|
          verify_archive(archive)
        end

        Rails.logger.info("Verification complete: #{@stats}")
        @stats
      end

      def verify_specific_archive(archive_id)
        archive = Points::RawDataArchive.find(archive_id)
        verify_archive(archive)
      end

      def verify_month(user_id, year, month)
        archives = Points::RawDataArchive.for_month(user_id, year, month)
                                         .where(verified_at: nil)

        Rails.logger.info("Verifying #{archives.count} archives for #{year}-#{format('%02d', month)}...")

        archives.each { |archive| verify_archive(archive) }
      end

      private

      def unverified_archives
        Points::RawDataArchive.where(verified_at: nil)
      end

      def verify_archive(archive)
        Rails.logger.info("Verifying archive #{archive.id} (#{archive.month_display}, chunk #{archive.chunk_number})...")

        verification_result = perform_verification(archive)

        if verification_result[:success]
          archive.update!(verified_at: Time.current)
          @stats[:verified] += 1
          Rails.logger.info("✓ Archive #{archive.id} verified successfully")
        else
          @stats[:failed] += 1
          Rails.logger.error("✗ Archive #{archive.id} verification failed: #{verification_result[:error]}")
          ExceptionReporter.call(
            StandardError.new(verification_result[:error]),
            "Archive verification failed for archive #{archive.id}"
          )
        end
      rescue StandardError => e
        @stats[:failed] += 1
        ExceptionReporter.call(e, "Failed to verify archive #{archive.id}")
        Rails.logger.error("✗ Archive #{archive.id} verification error: #{e.message}")
      end

      def perform_verification(archive)
        # 1. Verify file exists and is attached
        unless archive.file.attached?
          return { success: false, error: 'File not attached' }
        end

        # 2. Verify file can be downloaded
        begin
          compressed_content = archive.file.blob.download
        rescue StandardError => e
          return { success: false, error: "File download failed: #{e.message}" }
        end

        # 3. Verify file is not empty
        if compressed_content.bytesize.zero?
          return { success: false, error: 'File is empty' }
        end

        # 4. Verify MD5 checksum (if blob has checksum)
        if archive.file.blob.checksum.present?
          calculated_checksum = Digest::MD5.base64digest(compressed_content)
          if calculated_checksum != archive.file.blob.checksum
            return { success: false, error: 'MD5 checksum mismatch' }
          end
        end

        # 5. Verify file can be decompressed and is valid JSONL, extract data
        begin
          archived_data = decompress_and_extract_data(compressed_content)
        rescue StandardError => e
          return { success: false, error: "Decompression/parsing failed: #{e.message}" }
        end

        point_ids = archived_data.keys

        # 6. Verify point count matches
        if point_ids.count != archive.point_count
          return {
            success: false,
            error: "Point count mismatch: expected #{archive.point_count}, found #{point_ids.count}"
          }
        end

        # 7. Verify point IDs checksum matches
        calculated_checksum = calculate_checksum(point_ids)
        if calculated_checksum != archive.point_ids_checksum
          return { success: false, error: 'Point IDs checksum mismatch' }
        end

        # 8. Check which points still exist in database (informational only)
        existing_count = Point.where(id: point_ids).count
        if existing_count != point_ids.count
          Rails.logger.info(
            "Archive #{archive.id}: #{point_ids.count - existing_count} points no longer in database " \
            "(#{existing_count}/#{point_ids.count} remaining). This is OK if user deleted their data."
          )
        end

        # 9. Verify archived raw_data matches current database raw_data (only for existing points)
        if existing_count.positive?
          verification_result = verify_raw_data_matches(archived_data)
          return verification_result unless verification_result[:success]
        else
          Rails.logger.info(
            "Archive #{archive.id}: Skipping raw_data verification - no points remain in database"
          )
        end

        { success: true }
      end

      def decompress_and_extract_data(compressed_content)
        archived_data = {}

        # The block form closes the reader even if a line fails to parse.
        Zlib::GzipReader.wrap(StringIO.new(compressed_content)) do |gz|
          gz.each_line do |line|
            data = JSON.parse(line)
            archived_data[data['id']] = data['raw_data']
          end
        end

        archived_data
      end

      def verify_raw_data_matches(archived_data)
        # Verify every point when the archive holds 100 or fewer;
        # otherwise check a random sample of 100 points.
        if archived_data.size <= 100
          point_ids_to_check = archived_data.keys
        else
          point_ids_to_check = archived_data.keys.sample(100)
        end

        # Filter to only check points that still exist in the database
        existing_point_ids = Point.where(id: point_ids_to_check).pluck(:id)

        if existing_point_ids.empty?
          # No points remain to verify, but that's OK
          Rails.logger.info('No points remaining to verify raw_data matches')
          return { success: true }
        end

        mismatches = []

        Point.where(id: existing_point_ids).find_each do |point|
          archived_raw_data = archived_data[point.id]
          current_raw_data = point.raw_data

          # Compare the raw_data (both should be hashes)
          if archived_raw_data != current_raw_data
            mismatches << {
              point_id: point.id,
              archived: archived_raw_data,
              current: current_raw_data
            }
          end
        end

        if mismatches.any?
          return {
            success: false,
            error: "Raw data mismatch detected in #{mismatches.count} point(s). " \
                   "First mismatch: Point #{mismatches.first[:point_id]}"
          }
        end

        { success: true }
      end

      def calculate_checksum(point_ids)
        Digest::SHA256.hexdigest(point_ids.sort.join(','))
      end
    end
  end
end
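
The archive format this verifier reads (gzip-compressed JSONL, one object with `id` and `raw_data` per line) and its order-independent point-ID checksum can be tried outside Rails. The following is only a minimal sketch under stated assumptions: `build_archive`, `extract_archive`, and `point_ids_checksum` are hypothetical helper names, the sample records are invented, and the writer side is a guess at how such an archive might be produced, not the project's archiver.

```ruby
require 'digest'
require 'json'
require 'stringio'
require 'zlib'

# Hypothetical writer: turns { id => raw_data } into gzip-compressed JSONL.
# This mirrors the shape the verifier expects; the real archiver may differ.
def build_archive(records)
  jsonl = records.map { |id, raw_data| JSON.generate(id: id, raw_data: raw_data) }
  Zlib.gzip("#{jsonl.join("\n")}\n")
end

# Reader in the style of decompress_and_extract_data: one JSON object per line.
def extract_archive(compressed)
  archived = {}
  Zlib::GzipReader.wrap(StringIO.new(compressed)) do |gz|
    gz.each_line do |line|
      row = JSON.parse(line)
      archived[row['id']] = row['raw_data']
    end
  end
  archived
end

# Same recipe as calculate_checksum: sort the IDs so order does not matter.
def point_ids_checksum(point_ids)
  Digest::SHA256.hexdigest(point_ids.sort.join(','))
end

# Invented sample data for illustration only.
records = { 3 => { 'lat' => 1.5 }, 1 => { 'lat' => 2.5 } }
compressed = build_archive(records)
archived = extract_archive(compressed)

puts archived == records
puts point_ids_checksum([3, 1]) == point_ids_checksum([1, 3])
```

Sorting before joining is what lets the checksum survive a reordering of lines inside the archive; a changed, missing, or extra ID still produces a different digest.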