Martin Strobel


On the Privacy Risks of Model Explanations

R. Shokri, M. Strobel, Y. Zick
ArXiv preprint
Presented at the 2020 Workshop on Human Interpretability in Machine Learning (WHI)


Privacy and transparency are two key elements of trustworthy machine learning. Model explanations can provide more insight into a model's decisions on input data. This, however, can impose a significant privacy risk to the model's training set. We analyze whether an adversary can exploit model explanations to infer sensitive information about the model's training set. We investigate this research problem primarily using membership inference attacks: inferring whether a data point belongs to the training set of a model given its explanations. We study this problem for three popular types of model explanations: backpropagation-based, perturbation-based and example-based attribution methods. We devise membership inference attacks based on these model explanations, and extensively test them on a variety of datasets. We show that both backpropagation- and example-based explanations can leak a significant amount of information about individual data points in the training set. More importantly, we design reconstruction attacks against example-based model explanations, and use them to recover significant parts of the training set. Finally, we discuss the resistance of perturbation-based attribution methods to existing attack models; interestingly, their resistance to such attacks is related to a crucial shortcoming of such model explanations uncovered by recent works.


High Dimensional Model Explanations: an Axiomatic Approach

N. Patel, M. Strobel, Y. Zick
In Proceedings of the 4th ACM Conference on Fairness, Accountability and Transparency (FAccT 2021)
Also presented at the 2020 Workshop on Human Interpretability in Machine Learning (WHI)


Complex black-box machine learning models are regularly used in critical decision-making domains. This has given rise to several calls for algorithmic explainability. Many explanation algorithms proposed in literature assign importance to each feature individually. However, such explanations fail to capture the joint effects of sets of features. Indeed, few works so far formally analyze high dimensional model explanations. In this paper, we propose a novel high dimension model explanation method that captures the joint effect of feature subsets. We propose a new axiomatization for a generalization of the Banzhaf index; our method can also be thought of as an approximation of a black-box model by a higher-order polynomial. In other words, this work justifies the use of the generalized Banzhaf index as a model explanation by showing that it uniquely satisfies a set of natural desiderata and that it is the optimal local approximation of a black-box model. Our empirical evaluation of our measure highlights how it manages to capture desirable behavior, whereas other measures that do not satisfy our axioms behave in an unpredictable manner.


A Characterization of Monotone Influence Measures for Data Classification

J. Sliwinski, M. Strobel, Y. Zick
In Proceedings of the 33nd AAAI Conference on Artificial Intelligence (AAAI'19)


In this work we focus on the following question: how important was the i-th feature in determining the outcome for a given datapoint? We identify a family of influence measures; functions that, given a datapoint x, assign a value to every feature i, which roughly corresponds to that i's importance in determining the outcome for x. This family is uniquely derived from a set of axioms: desirable properties that any reasonable influence measure should satisfy. Departing from prior work on influence measures, we assume no knowledge of - or access to - the underlying classifier labelling the dataset. In other words, our influence measures are based on the dataset alone, and do not make any queries to the classifier. While this requirement naturally limits the scope of explanations we provide, we show that it is effective on real datasets.

Exploring the no-show paradox for Condorcet extensions using Ehrhart theory and computer simulations

F. Brandt, J. Hofbauer, and M. Strobel
In Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS) IFAAMAS, 2019.
Also: In Proceedings of the 7th International Workshop on Computational Social Choice (COMSOC), 2018.


Results from voting theory are increasingly used when dealing with collective decision making in computational multiagent systems. An important and surprising phenomenon in voting theory is the No-Show Paradox (NSP) , which occurs if a voter is better off by abstaining from an election. While it is known that certain voting rules suffer from this paradox in principle, the extent to which it is of practical concern is not well understood. We aim at filling this gap by analyzing the likelihood of the NSP for three Condorcet extensions (Black’s rule, MaxiMin, and Tideman’s rule) under various preference models using Ehrhart theory as well as extensive computer simulations. We find that, for few alternatives, the probability of the NSP is negligible (less than 1% for four alternatives and all considered preference models, except for Black’s rule). As the number of alternatives increases, the NSP becomes much more likely and which rule is most susceptible to abstention strongly depends on the underlying distribution of preferences.


Catching Captain Jack: Efficient time and space dependent patrols to combat oil-siphoning in international waters

B. An, F. Kong, M. Strobel, X. Wang
In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI'18), pp.208-215.


Pirate syndicates capturing tankers to siphon oil, causing an estimated cost of $5 billion a year, has become a serious security issue for maritime traffic. In response to the threat, coast guards and navies deploy patrol boats to protect international oil trade. However, given the vast area of the sea and the highly time and space dependent behaviors of both players, it remains a significant challenge to find efficient ways to deploy patrol resources. In this paper, we address the research challenges and provide four key contributions. First, we construct a Stackelberg model of the oil-siphoning problem based on incident reports of actual attacks; Second, we propose a compact formulation and a constraint generation algorithm, which tackle the exponentially growth of the defender’s and attacker’s strategy spaces, respectively, to compute efficient strategies of security agencies; Third, to further improve the scalability, we propose an abstraction method, which exploits the intrinsic similarity of defender’s strategy space, to solve extremely large-scale games; Finally, we evaluate our approaches through extensive simulations and a detailed case study with real ship traffic data. The results demonstrate that our approach achieves a dramatic improvement of scalability with modest influence on the solution quality and can scale up to realistic-sized problems.


Analyzing the practical relevance of voting paradoxes via Ehrhart theory, computer simulations, and empirical data

F. Brandt, C. Geist, and M. Strobel
In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 385–393. IFAAMAS, 2016.


Results from social choice theory are increasingly used to argue about collective decision making in computational multiagent systems. A large part of the social choice literature studies voting paradoxes in which seemingly mild properties are violated by common voting rules. In this paper, we investigate the likelihood of the Condorcet Loser Paradox (CLP) and the Agenda Contraction Paradox (ACP) using Ehrhart theory, computer simulations, and empirical data. We present the first analytical results for the CLP on four alternatives and show that our experimental results, which go well beyond four alternatives, are in almost perfect congruence with the analytical results. It turns out that the CLP—which is often cited as a major flaw of some Condorcet extensions such as Dodgson’s rule, Young’s rule, and MaxiMin—is of no practical relevance. The ACP, on the other hand, frequently occurs under various distributional assumptions about the voters’ preferences. The extent to which it is real threat, however, strongly depends on the voting rule, the underlying distribution of preferences, and, somewhat surprisingly, the parity of the number of voters.


Fractional hedonic games: Individual and group stability

F. Brandl, F. Brandt, and M. Strobel
In Proceedings of the 14th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 1219–1227. IFAAMAS, 2015.


Coalition formation provides a versatile framework for analyzing cooperative behavior in multi-agent systems. In particular, hedonic coalition formation has gained considerable attention in the literature. An interesting class of hedonicgames recently introduced by Aziz et al. are fractional hedonic games. In these games, the utility an agent assigns to a coalition is his average valuation for the members of his coalition. Three common notions of stability in hedonic games are core stability, Nash stability, and individual stability. For each of these notions we show that stable partitions may fail to exist in fractional hedonic games. For core stable partitions this holds even when all players only have symmetric zero/one valuations (“mutual friendship”). We then leverage these counter-examples to show that deciding the existence of stable partitions (and therefore also computing stable partitions) is NP-hard for all considered stability notions. Moreover, we show that checking whether the valuation functions of a fractional hedonic game induce strict preferences over coalitions is coNP-complete.

Privacy and Transparency aspects of Machine Learning

Algorithms play an ever-increasing role in our everyday lives, we let them decide which news we read, get recommendations about movies or job candidates from them and so on. Simultaneously, algorithms become more and more complex. The advent of machine learning and artificial intelligence led to algorithms based on amounts of data no human will ever be able to comprehend in a lifetime. This makes the resulting algorithms and their decisions incomprehensible to humans, we trust them mostly because they kind of seem to work well, not because we understand them. Algorithmic Transparency tries to solve this problem and attempts to make decisions made by algorithms understandable for humans.

Quantification of Voting Paradoxes

Computational Social Choice is plastered with impossibility results and examples of paradoxes showing that voting rules are flawed in one way or another. The majority of these results, however, tend to be theoretical e.g. showing that in a very specific, often unrealistic situation something bad might happen, i.e. under certain assumptions, it might be beneficial for a voter not to attend an election.

I'm interested in quantifying how often these paradoxes actually occur in practice?

Ethics in Artificial Intelligence and Machine Learning

As the societal impact of Machine Learning and Artifical Intelligence increases it becomes more and more obvious that these new technologies come, as most technological advances, with many ethical dilemmas. Moreover, and this seems to be a new dimension, we might soon see situations where the machines themselves face ethical dilemmas. The arguably most prominent example is derived from the famous Trolley dilemma. If a self-driving car is in a situation where an accident is unavoidable, but somehow has still some influence over who is getting hurt in the accident, how does it make the decision whom to hurt? A more mundane example would be: Should your personalized news feed show you pictures of your partner betraying you or hide this upsetting fact? Researchers have started discussing how to teach ethics to machines. I'm interested in the question if their background, i.e. research in AI, biases them in the way they want to solve this challenge.

Teaching Experience

  • Teaching Assistant for Machine Learning (Fall 2018, Fall 2020), NUS [CS3244]
  • Teaching Assistant for Introduction to Artificial Intelligence (Spring 2018, Spring 2019, Spring 2020, Summer 2020), NUS [CS3243]
  • Teaching Assistant for Software Engineering (Spring 2017), NTU
  • Tutor for Fundamentals of Algorithms and Data Structures (Summer 2015), TUM [IN0007]
  • Tutor for Introduction to Theory of Computation (Summer 2014), TUM [IN0011]
  • Mentor for Mathematics for Science and Economics (Fall 2013), TUM [MA9711]

Teaching Education

  • Teaching Assistant Programme, NTU [HWG702]
  • Advanced Didactical Topics for Tutors audited, TUM [ED0217]
  • Pedagogical Training in Didactics for Tutors, TUM [IN9028]
  • Teaching Science at University [Coursera]