Our work fuses these strands, extending the literature (e.g., Li et al., 2022) into the cross‑category domain.

3. Category‑Spanning Retrieval Framework (CSRF)

3.1 Overview

    +-------------------+      +-------------------+      +-------------------+
    | Structured DBs    | ---> | Fusion Engine     | ---> | Ranking Module    |
    | (civil, tax, etc) |      | (confidence       |      | (precision‑recall |
    +-------------------+      |  weighting)       |      |  trade‑off)       |
              |                +-------------------+      +-------------------+
              v
    +-------------------+      +-------------------+
    | Unstructured Text | ---> | Entity Extractor  |
    | (tweets, blogs)   |      | (NER + fuzzy)     |
    +-------------------+      +-------------------+
              |
              v
    +-------------------+      +-------------------+
    | Images / Video    | ---> | Face/Person Re‑ID |
    | (Instagram, CCTV) |      | (OSNet)           |
    +-------------------+      +-------------------+
              |
              v
    +-------------------+      +-------------------+
    | Graph Signals     | ---> | Linkage Analyzer  |
    | (call‑graphs,     |      | (probabilistic    |
    |  co‑attendance)   |      |  graph model)     |
    +-------------------+      +-------------------+

3.2 Data Sources

| Category | Example Sources | Access Modality |
|----------|-----------------|-----------------|
| Official Registers | Thai Civil Registration, Tax Office | Secure API (OAuth2, audit logs) |
| Commercial Services | Grab, LINE Taxi, Foodpanda | Partner data‑share agreements |
| Social Media | Instagram, Twitter, Facebook | Public endpoints + rate‑limited API |
| Multimedia | CCTV (BMA), YouTube geotagged videos | Stream processing (Kafka) |
| Relational Graphs | Mobile‑call records (anonymized), Event RSVP logs | Batch ETL pipelines |
\[ S = \sum_{i=1}^{N} w_i \cdot c_i, \quad \sum_{i=1}^{N} w_i = 1 \]
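The weighted sum above can be illustrated with a minimal sketch. It assumes each c_i is a per‑category confidence in [0, 1] and that raw weights are rescaled so that they sum to 1. The function name `fuse_confidences` and the example numbers are hypothetical and do not come from the study.

```python
from typing import Sequence


def fuse_confidences(weights: Sequence[float], confidences: Sequence[float]) -> float:
    """Return S = sum_i w_i * c_i, with the weights rescaled to sum to 1."""
    if len(weights) != len(confidences):
        raise ValueError("weights and confidences must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Dividing by the total enforces the constraint sum_i w_i = 1.
    return sum((w / total) * c for w, c in zip(weights, confidences))


# Hypothetical example: three categories, the first weighted twice as heavily.
score = fuse_confidences([2.0, 1.0, 1.0], [0.9, 0.5, 0.1])
print(round(score, 3))
```

Because the weights are normalised, S stays within the range of the input confidences, so scores remain comparable across queries that draw on different numbers of categories.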
Ground‑truth “Jessica” entities (n = 274) are planted with varying visibility across categories (e.g., 38 % appear only in images).

5.1 Evaluation Metrics

| Metric | Definition |
|--------|------------|
| Recall@k | Fraction of true Jessicas appearing in top‑k results. |
| Precision@k | Fraction of top‑k results that are true Jessicas. |
| Mean Query Latency | Wall‑clock time per query (average over 10 000 runs). |
| Privacy‑Risk Score | Expected number of leaked attributes per query (lower is better). |

5.2 Baselines

| Baseline | Description |
|----------|-------------|
| B1 – Structured‑Only | Queries only official registers. |
| B2 – Text‑Only | Full‑text search on social media. |
| B3 – Image‑Only | Face‑matching on CCTV feeds. |
| B4 – Simple Union | Union of all category results without weighting. |

5.3 Results

| System | Recall@10 | Precision@10 | Latency (s) | Privacy‑Risk |
|--------|-----------|--------------|-------------|--------------|
| B1 | 0.31 | 0.94 | 0.42 | 0.02 |
| B2 | 0.44 | 0.68 | 0.78 | 0.04 |
| B3 | 0.38 | 0.71 | 1.96 | 0.06 |
| B4 | 0.61 | 0.73 | 2.31 | 0.12 |
| CSRF (proposed) | 0.71 | 0.85 | 2.24 | 0.09 |
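The Recall@k and Precision@k definitions in Section 5.1 can be sketched as follows. This is an illustrative reading of those definitions, not the evaluation code used in the study. The function names and the toy identifiers are hypothetical, and Precision@k is assumed to divide by k even when fewer than k results are returned.

```python
from typing import Hashable, Sequence, Set


def _hits_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> int:
    """Count distinct relevant items among the first k ranked results."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(ranked[:k]) & relevant)


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k results that are relevant (denominator is k)."""
    return _hits_at_k(ranked, relevant, k) / k


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0  # assumption: recall is reported as 0 when nothing is relevant
    return _hits_at_k(ranked, relevant, k) / len(relevant)


# Toy example with synthetic identifiers.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "f", "g"}
print(precision_at_k(ranked, relevant, 3))  # 2 of the top 3 are relevant
print(recall_at_k(ranked, relevant, 3))     # 2 of the 4 relevant items are found
```

Under this reading, Recall@k cannot reach 1 when there are more than k relevant items, which is worth keeping in mind when comparing Recall@10 values across systems.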
A. Researcher¹, B. Analyst², C. Data‑Scientist³
Searching for “Jessica” in Bangkok across All Categories: A Multi‑Modal Retrieval Study