NEWS TaPP 2024 will be jointly held with IEEE EuroS&P on July 12, 2024!

Intro

The Theory and Practice of Provenance workshop series was started in San Francisco in 2009. TaPP aims to be a venue for early-stage and innovative research ideas related to provenance, and a forum to encourage exchange of ideas between researchers working on provenance and practitioners or potential users of such research. Industry and academic participants interested in provenance in any setting are welcome, and workshop contributions describing unsolved problems or new potential application areas for provenance research are particularly welcome.

Previous TaPPs

Call for Papers

TaPP’24 inherits the tradition of providing a genuine workshop environment for discussing and developing new ideas and exploring connections between disciplines and between academic research on provenance and practical applications. We particularly welcome submissions covering security-related topics this year but continue to support diverse conversations from all disciplines, bringing together researchers from a wide range of computer science domains including workflows, databases, high performance computing, distributed systems, operating systems, programming languages, and software engineering, that have urgent provenance needs. Submission topics reflect the cross-cutting nature of this field, including, but not limited to:

Trust, transparency, and accountability
- Provenance and trustworthy systems
- Provenance to support accountability and transparency
Deployability, security, and forensic capabilities of provenance capture
- Standardization of provenance models and representations
- Retroactive reconstruction of provenance
- Integrating provenance information
Storage and query efficiency
- Analytics, querying, and reasoning about provenance
- Performance aspects of provenance capture, storage, and analytics
Accuracy, performance, and explainability of provenance analysis
- Security and privacy implications of provenance
- Visualizing provenance information
- Human interaction with provenance
Dataset and evaluation
- Generating provenance datasets and comparing provenance-based security solutions

Important Dates

Event	Date
Paper Submission Deadline	~~March 15, 2024~~ March 22, 2024
Author Notification	April 30, 2024
Camera-ready Due	May 15, 2024

Submission instructions

Research Track

We invite innovative and creative contributions, including papers outlining new challenges for provenance research, promising formal approaches to provenance, experiments, and visionary (and possibly risky) ideas. Research track papers can be submitted as:

Long papers: This track solicits full research papers that describe mature, high-quality research up to 8 pages, excluding references and other supporting material.
Short papers: This track typically include work in progress or visionary ideas up to 4 pages (excluding references and other supporting material) that are not yet fully developed, but have the potential to lead to cutting-edge research.

Application Track

Application track papers up to 6 pages (excluding references and other supporting material) should focus on innovative use of provenance, deployment of provenance-based solutions and open-source software. We invite authors to share insights, experience, and lessons learned when deploying provenance systems. We encourage submissions describing datasets, tools, or infrastructure that could benefit the community at large.

“Best of the Rest”

As the provenance community is now established and publishing in conferences and journals across computer science domains, TaPP is forming a “Best of the Rest” category. If you have had a provenance paper accepted to a conference or journal in 2023 through February 2024, we invite you to submit the abstract to TaPP. We will invite the “Best of the Rest” to present their work at TaPP to widen the conversation. Submit the abstract and original publication venue and date of the previously published work. The title of the abstract must begin with “BEST:”.

Regardless of the track, papers must be:

Not published or under review elsewhere
Formatted according to the conference format, following the IEEE EuroS&P conference proceedings templates (https://www.ieee-security.org/TC/EuroSP2023/eurosp-2023-template.zip)
Supporting material may be submitted, but the reviewers will not be obliged to read them. Maximum \(2\) pages for appendices and additional \(2\) pages for references are allowed
Submitted via EasyChair and updated at any time until the submission deadline

Workshop Schedule

Please register at the registration desk by the entrance of the building. The workshop will take place in Seminar Room 6.

Time	Event
08:00	Registration Opens
09:00 - 09:15	Opening and Welcome Remarks, Local Chair Bertram Ludäscher
09:15 - 10:15	Keynote (R. Sekar, Stony Brook University)
10:15 - 10:35	Morning Break
10:40 - 11:20	Session I: Capture
11:25 - 12:25	Session II: Application
12:30 - 13:30	Lunch
13:30 - 14:30	Invited Talk (Mohammad Reza Mousavi, King’s College London)
14:30 - 15:15	Session III: Usage
15:15 - 15:35	Afternoon Break
15:35 - 16:00	Town Hall and Conclusions

Session I: Capture

Abdullah Almuntashiri, Luis-Daniel Ibáñez and Adriane Chapman. LLMs for the Post-hoc Creation of Provenance.
Nils Hoffmann and Neda Ebrahimi Pour. A Low Overhead Approach for Automatically Tracking Provenance in Machine Learning Workflows.

Session II: Application

Wisam Abbasi, Adel Taweel and Andrea Saracino. A Provenance-driven Approach for Detecting Revenue Leakage in Telecom.
Michael Johnson, Marcus Paradies, Hans-Rainer Klöckner, Albina Muzafarova, Kristen Lackeos, David J. Champion, Marta Dembska and Sirko Schindler. Evaluation of Provenance Serialisations for Astronomical Provenance.
Tanja Auge, Laura Waltersdorfer, Emil Michels, Susanne Feistel, Susanne Jürgensmann, Fajar J. Ekaputra and Meike Klettke. Towards an Integrated Provenance Framework: A Scenario for Marine Data.

Session III: Usage

Shatha Algarni, Boris Glavic, Seokki Lee and Adriane Chapman. Solving Why Not Questions for Aggregate Constraints through Query Repair.
Yilin Xia, Shawn Bowers and Bertram Ludäscher. On the Structure of Game Provenance and its Applications.

Keynote and Invited Talks

Prof. R. Sekar, Stony Brook University, Provenance Collection, Storage and Analysis for Cyber Attack Investigation: Challenges and Recent Advances

There is wide consensus in the research community that system-call level provenance is essential for detecting and understanding advanced cyberattack campaigns, otherwise called APT campaigns. Unfortunately, provenance collection and analysis at system call granularity poses several major challenges. First, existing system call loggers incur high overheads, slowing down workloads by 2x to 8x. By exploiting this weakness, APT actors can overwhelm loggers and cause them to drop the vast majority of system calls, thus providing an effective vehicle for evasion. Secondly, provenance logs can grow to hundreds of GBs per day per host. The costs of storing months of logs across all the hosts can strain the IT budgets of even the well-resourced organizations. Third, even when all the attack activity is present in the logs, it is very challenging to filter out the benign background activity to zoom in on the attack activity, since attacks typically account for a vanishingly small fraction of the logs. In this talk, we describe these challenges and present our approach for addressing them. Our provenance collection system, called eAudit, incorporates several new techniques to reduce the performance overheads to single digits. To tackle the data volume challenge, we have developed compact event encoding techniques that reduce log sizes by 10x. In addition, we devised novel techniques for identifying and eliminating redundant events, further reducing the number of events by 10x, while provably preserving forensic analysis results. Finally, we present our attack detection and APT campaign reconstruction techniques that achieve excellent results in the DARPA Transparent Computing program.

Prof. Mohammad Reza Mousavi, King’s College London, Causality for Trustworthiness

Causality is an ancient concept, which is central to much of the scientific methodology, as well as the legal and regulatory systems. In this talk, we first review some of our recent work on trust and trustworthiness for autonomous systems. Then, we discuss the concept of causality and present some recent applications of actual causality in autonomous systems, including applications in explainability and insurance law. We conclude by some remarks on the potential application of causality in provenance.

Organizers

Program Chairs

Xueyuan Han-Vanbastelaer, Wake Forest University, Co-Chair
Yuval Moskovitch, Ben-Gurion University of the Negev, Co-Chair
Bertram Ludäscher, University of Illinois Urbana-Champaign, Local Chair

Program Committee

Adam Bates, University of Illinois Urbana-Champaign
Ghita Berrada, King’s College London
Elisa Bertino, Purdue University
James Cheney, University of Edinburgh
David Eyers, University of Otago
Peng Fei, Stellar Cyber
Ashish Gehani, SRI International
Dongqi Han, Tsinghua University
Melanie Herschel, University of Stuttgart
Ahmad Humayun, Virginia Tech
Wajih UI Hassan, University of Virginia
Matteo Interlandi, Microsoft Research
Bertram Ludäscher, University of Illinois Urbana-Champaign
Tanu Malik, DePaul University
Thomas Pasquier, University of British Columbia
Adam Pocock, Oracle
R. Sekar, Stony Brook University
Pierre Senellart, Ecole Normale Superieure
Benjamin Ujcich, Georgetown University
Xiao Yu, Stellar Cyber
Jun Zeng, Huawei

TaPP Steering Committee

Adriane Chapman (chair)
James Cheney
Sara Cohen-Boulakia
Ashish Gehani
Melanie Herschel
Bill Howe
Bertram Ludäscher
Tanu Malik
Paolo Missier
Thomas Moyer
Thomas Pasquier
Sudeepa Roy
Daniel Deutch

Ethical Considerations

We adopt the ethical guidelines from IEEE S&P 2024. We report them here verbatim.

Ethical Considerations for Vulnerability Disclosure

Where research identifies a vulnerability (e.g., software vulnerabilities in a given program, design weaknesses in a hardware system, or any other kind of vulnerability in deployed systems), we expect that researchers act in a way that avoids gratuitous harm to affected users and, where possible, affirmatively protects those users. In nearly every case, disclosing the vulnerability to vendors of affected systems, and other stakeholders, will help protect users. It is the committee’s sense that a disclosure window of 45 days (see this link) to 90 days (see this link) ahead of publication is consistent with authors’ ethical obligations.

Longer disclosure windows (which may keep vulnerabilities from the public for extended periods of time) should only be considered in exceptional situations, e.g., if the affected parties have provided convincing evidence the vulnerabilities were previously unknown and the full rollout of mitigations requires additional time. The authors are encouraged to consult with the PC chairs in case of questions or concerns.

The version of the paper submitted for review must discuss in detail the steps the authors have taken or plan to take to address these vulnerabilities; but, consistent with the timelines above, the authors do not have to disclose vulnerabilities ahead of submission. If a paper raises significant ethical and/or legal concerns, it will be checked by the REC and it might be rejected based on these concerns. The PC chairs will be happy to consult with authors about how this policy applies to their submissions.

Ethical Considerations for Human Subjects Research

Submissions that describe experiments that could be viewed as involving human subjects, that analyze data derived from human subjects (even anonymized data), or that otherwise may put humans at risk should: - Disclose whether the research received an approval or waiver from each of the authors’ institutional ethics review boards (IRB) if applicable. - Discuss steps taken to ensure that participants and others who might have been affected by an experiment were treated ethically and with respect.

If a submission deals with any kind of personal identifiable information (PII) or other kinds of sensitive data, the version of the paper submitted for review must discuss in detail the steps the authors have taken to mitigate harms to the persons identified. If a paper raises significant ethical and/or legal concerns, it will be checked by the REC and it might be rejected based on these concerns. The PC chairs will be happy to consult with authors about how this policy applies to their submissions.

Contact

To contact us about the workshop please email the following address.

16th International Workshop on Theory and Practice of Provenance