Tag Localization - RFID Reads and Fixture Mappings
For our paper titled "Show Me the Money: RFID-based Article-to-Fixture Predictions for Fashion Retail Stores", we created multiple datasets consisting of RFID read events in S0 and S1 during daily stocktaking processes, our calculated article-to-fixture mappings, and the corresponding ground truth. Specifically, the dataset contains the following real-world and lab data for the presented experiments:
- EPC to article to fixture mappings.
- Reference tag to fixture mappings.
- Raw RFID read event data with Timestamp, EPC, RSSI and RFID-reader session.
A more detailed description of the dataset can be found in our publication below. Note that the dataset is free to use for research purposes but requires citing our paper as the source of the data.
- M. Wölbitsch, T. Hasler, D. Helic and S. Walk (2020). Show Me the Money: RFID-based Article-to-Fixture Predictions for Fashion Retail Stores. In Proceedings of 14th IEEE International Conference on RFID (IEEE RFID 2020) [PDF] [Dataset]
Large-Scale Retail vs. Ecom Sales Dataset
We used this dataset to compare retail and ecom buying behavior and enriched it with Smart Fitting Room data in the paper "Mind the Gap: Exploring Shopping Preferences Across Fashion Retail Channels". The dataset contains the following data:
- E-com shopping baskets.
- Retail store shopping baskets.
- Smart Fitting Room fitting baskets (i.e., articles brought together into the fitting room).
A more detailed description of the dataset can be found in our publication below. Note that the dataset is free to use for research purposes but requires citing our paper as the source of the data.
- M. Wölbitsch, T. Hasler, S. Walk and D. Helic (2020). Mind the Gap: Exploring Shopping Preferences Across Fashion Retail Channels. In Proceedings of 28th ACM Conference on User Modeling, Adaptation and Personalization (UMAP 2020) [PDF] [Dataset]
Large-Scale Stocktake RFID Read-Events Dataset
This dataset (>2GB!) was introduced in the paper "RFID in the Wild - Analyzing Stocktake Data to Determine Detection Probabilities of Products", and consists of two parts:
- Information about the individual stocktakes.
- The read events of individual stocktakes.
Data Structure
Specifically, the dataset contains the following information about individual stocktakes in the sample_stocktakes.csv file:
- InventoryId: the unique identifier of a stocktake
- Store: the unique identifier of the store in which the stocktake was performed
- Region: the region in which the store is located (i.e., US, Europe, or Asia)
- Expected: the number of items which were expected by the stock management system for the stocktake
- Unexpected: the number of items which were not expected by the stock management system for the stocktake
- Missing: the number of items which were expected by the stock management system but not found for the stocktake
- Actual: the number of items which were actually read during the stocktake
- TimeStampStart and TimeStampEnd: the UTC timestamps at which the stocktake started and finished
- Accuracy: the achieved accuracy of the stocktake (determined by the item quantities)
Furthermore, all items recorded in each stocktake are available in the sample_read_events.csv file, which contains:
- InventoryId: the unique identifier of a stocktake
- EpcSerial: the unique identifier of an item (EPC without company prefix)
- Product: the identifier of the product the item is associated with
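As a minimal sketch of how the two files relate, the following Python snippet counts the distinct items read per stocktake and attaches that count to the stocktake metadata via InventoryId. It assumes that the CSV headers match the field names listed above. The inline rows and the helper names (items_read_per_stocktake, summarize) are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Made-up rows standing in for sample_stocktakes.csv (illustration only;
# the timestamp and accuracy columns are left out for brevity).
STOCKTAKES_CSV = """InventoryId,Store,Region,Expected,Unexpected,Missing,Actual
1,10,Europe,3,0,1,2
2,11,US,2,1,0,3
"""

# Made-up rows standing in for sample_read_events.csv (illustration only).
READ_EVENTS_CSV = """InventoryId,EpcSerial,Product
1,1001,A
1,1002,A
2,2001,B
2,2002,B
2,2003,C
"""


def items_read_per_stocktake(read_events_file):
    """Return {InventoryId: number of distinct EpcSerial values}."""
    serials = defaultdict(set)
    for row in csv.DictReader(read_events_file):
        serials[row["InventoryId"]].add(row["EpcSerial"])
    return {inventory_id: len(s) for inventory_id, s in serials.items()}


def summarize(stocktakes_file, read_events_file):
    """Attach the distinct-item count to each stocktake row."""
    counts = items_read_per_stocktake(read_events_file)
    summary = []
    for row in csv.DictReader(stocktakes_file):
        summary.append(
            {
                "InventoryId": row["InventoryId"],
                "Region": row["Region"],
                "Actual": int(row["Actual"]),
                "ItemsRead": counts.get(row["InventoryId"], 0),
            }
        )
    return summary


summary = summarize(io.StringIO(STOCKTAKES_CSV), io.StringIO(READ_EVENTS_CSV))
for entry in summary:
    print(entry)
```

For the real files, the io.StringIO objects could be replaced with open("sample_stocktakes.csv", newline="") and open("sample_read_events.csv", newline=""). Reading row by row keeps only one set of serials per stocktake in memory, which may be more practical than loading everything at once given the size of the dataset.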
A more detailed description of the dataset can be found in our publication below. Note that the dataset is free to use for research purposes but requires citing our paper as the source of the data.
- M. Wölbitsch, T. Hasler, M. Goller, C. Gütl, S. Walk and D. Helic (2019). RFID in the Wild - Analyzing Stocktake Data to Determine Detection Probabilities of Products. In Proceedings of 6th IEEE International Conference on Internet of Things: Systems, Management and Security IOTSMS2019 [PDF] [Dataset]
Narrative-Driven Recommendations Dataset
This dataset contains crowdsourced and manually curated annotations for submissions and comments to r/MovieSuggestions. Specifically, the annotations include movies (IMDb IDs), keywords, actors and genres for more than 1,400 submissions and 20,000 comments.
The dataset was generated for the purpose of analyzing narrative-driven recommendations, using data dumps available at pushshift.io/reddit/.
Data Structure
- submissions.csv: contains crowdsourced and manually curated annotations for movie suggestion requests on r/MovieSuggestions. Specifically, the file includes the reddit submission id, positively mentioned movie ids (IMDb), negatively mentioned movie ids (IMDb), as well as desired and undesired keywords, genres and actors.
- comments.csv: contains annotations for comments posted on r/MovieSuggestions. Each line contains the id of the reddit submission the comment was posted under, the reddit comment id, and the IMDb movie ids annotated in that comment.
- movie_titles.csv: includes a mapping between IMDb movie ids and their original titles (both found on IMDb)
A more detailed description of the dataset can be found in our publication below. Note that the dataset is free to use for research purposes but requires citing our paper as the source of the data.
- L. Eberhard, S. Walk, L. Posch and D. Helic (2019). Evaluating Narrative-Driven Movie Recommendations on Reddit. In Proceedings of 24th International Conference on Intelligent User Interfaces IUI2019. [PDF] [Dataset] [ACM DL]
RFID Tag Localization Dataset
This dataset includes CSVs with all read events we collected for the experiments conducted for our paper "Estimating Relative Tag Locations based on Time-Differences in Read Events".
Specifically, the dataset contains the following fields:
- experiment_id: identifier of the experiment
- group: groups experiments with the same properties (setup, experiment, tags, iterations)
- setup: "2d" for 2d-setup, "2da" for 2d-asymmetric-setup, "3d" for 3d-setup
- experiment: either "walking" or "random"
- tags: number of tags involved in the experiment
- iterations: number of iterations
- milliseconds: milliseconds since beginning of the experiment
- serial: the serial number extracted from the epc
- rssi: the measured rssi value
The corresponding ground truth data is located in the files 2d.npy, 2d_asymmetric.npy, and 3d.npy. The files contain the ground truth coordinates of the tags, relative to the tag with serial number 0.
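As a rough illustration of how the read-event fields can be used, the sketch below finds the first time each tag was read within one experiment and sorts the serials by that time. This is only an exploratory view of the time differences between reads, not the estimation method from the paper. It assumes the CSV headers match the field names above; the sample rows and the helper name first_read_times are hypothetical.

```python
import csv
import io

# Made-up rows using the documented field names (illustration only).
READ_EVENTS_CSV = """experiment_id,group,setup,experiment,tags,iterations,milliseconds,serial,rssi
1,1,2d,walking,3,1,120,2,-61
1,1,2d,walking,3,1,40,0,-55
1,1,2d,walking,3,1,80,1,-58
1,1,2d,walking,3,1,200,0,-63
2,1,2d,walking,3,1,10,2,-60
"""


def first_read_times(read_events_file, experiment_id):
    """Return {serial: earliest milliseconds value} for one experiment."""
    first_seen = {}
    for row in csv.DictReader(read_events_file):
        if row["experiment_id"] != str(experiment_id):
            continue  # skip read events from other experiments
        serial = int(row["serial"])
        ms = float(row["milliseconds"])
        if serial not in first_seen or ms < first_seen[serial]:
            first_seen[serial] = ms
    return first_seen


first_seen = first_read_times(io.StringIO(READ_EVENTS_CSV), experiment_id=1)
# Serials sorted by the time they were first read.
read_order = sorted(first_seen, key=first_seen.get)
print(read_order)
```

The ground truth files are in NumPy's .npy format and can be loaded with numpy.load. The array layout is not described here, so it is worth inspecting the shape before comparing it against any ordering derived from the read events.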
The dataset is free to use for research purposes but requires citing our paper as the source of the data.
- T. Hasler, M. Wölbitsch, M. Goller and S. Walk (2019). Estimating Relative Tag Locations based on Time-Differences in Read Events. In Proceedings of 13th IEEE International Conference on RFID. [PDF] [Dataset]
Shopping-Baskets Dataset
The dataset consists of roughly half a million shopping baskets from 20 retail fashion stores located in four different cities. The data was collected between November 2016 and December 2018.
The dataset csv file contains the following fields:
- TransactionId: the transaction identifier, which can be used for grouping (i.e., generating shopping baskets)
- ProductId: the product identifier (anonymized product number)
- Date: the date on which a product was sold
- City: the city in which a product was sold (anonymized)
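Grouping by TransactionId to generate shopping baskets can be sketched as follows. The snippet assumes the CSV headers match the field names above. The sample rows, including the date format, and the helper name build_baskets are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up rows using the documented field names (illustration only).
SALES_CSV = """TransactionId,ProductId,Date,City
t1,p1,2017-01-05,c1
t1,p2,2017-01-05,c1
t2,p3,2017-01-06,c2
t1,p3,2017-01-05,c1
"""


def build_baskets(sales_file):
    """Return {TransactionId: [ProductId, ...]} in order of appearance."""
    baskets = defaultdict(list)
    for row in csv.DictReader(sales_file):
        baskets[row["TransactionId"]].append(row["ProductId"])
    return dict(baskets)


baskets = build_baskets(io.StringIO(SALES_CSV))
print(baskets)
```

Rows of one transaction do not need to be adjacent in the file, since each product is appended to the basket of its TransactionId wherever it appears.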
When using the dataset please cite our paper as the source of the data.
- M. Wölbitsch, S. Walk, M. Goller and D. Helic (2019). Beggars Can't Be Choosers: Augmenting Sparse Data for Embedding-Based Product Recommendations in Retail Stores. In Proceedings of 27th ACM International Conference on User Modeling, Adaptation and Personalization UMAP2019. [PDF] [Dataset]