Technical Reports
A List by Author: Václav Brožek
- e-mail:
- xbrozek(a)fi.muni.cz
- home page:
- https://www.fi.muni.cz/~xbrozek/
Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes
by
Tomáš Brázdil,
Václav Brožek,
Krishnendu Chatterjee,
Vojtěch Forejt,
Antonín Kučera,
A full version of the paper presented at conference LICS 2011. April 2011, 32 pages.
FIMU-RS-2011-02.
Available as Postscript,
PDF.
Abstract:
We study Markov decision processes (MDPs) with multiple
limit-average (or mean-payoff) functions. We consider two different
objectives, namely, expectation and satisfaction objectives. Given
an MDP with k reward functions, in the expectation objective the
goal is to maximize the expected limit-average value, and in the satisfaction
objective the goal is to maximize the probability of runs such that
the limit-average value stays above a given vector.
We show that
under the expectation objective, in contrast to the single-objective
case, both randomization and memory are necessary for strategies,
and that finite-memory randomized strategies are sufficient.
Under the satisfaction objective, in contrast to the
single-objective case, infinite memory is necessary
for strategies, and that randomized memoryless strategies
are sufficient for epsilon-approximation, for all
epsilon. We further prove that
the decision problems for both expectation and satisfaction
objectives can be solved in polynomial time and the trade-off curve
(Pareto curve) can be epsilon-approximated in time polynomial
in the size of the MDP and 1/epsilon, and exponential
in the number of reward functions, for all epsilon>0. Our
results also reveal flaws in
previous work for MDPs with multiple
mean-payoff functions under the expectation objective, correct the
flaws and obtain improved results.
Qualitative Reachability in Stochastic BPA Games
by
Václav Brožek,
Tomáš Brázdil,
Antonín Kučera,
Jan Obdržálek,
A full version of the paper presented at STACS 2009. May 2009, 37 pages.
FIMU-RS-2009-01.
Available as Postscript,
PDF.
Abstract:
We consider a class of infinite-state stochastic games generated by stateless pushdown
automata (or, equivalently, 1-exit recursive state machines), where the winning
objective is specified by a regular set of target configurations and a qualitative
probability constraint ‘>0’ or ‘=1’. The goal of one player is to maximize the probability
of reaching the target set so that the constraint is satisfied, while the other
player aims at the opposite. We show that the winner in such games can be determined
in NP intersection co-NP. Further, we prove that the winning regions for both players
are regular, and we design algorithms which compute the associated finite-state automata.
Finally, we show that winning strategies can be synthesized effectively.
Discounted Properties of Probabilistic Pushdown Automata
by
Tomáš Brázdil,
Václav Brožek,
Jan Holeček,
Antonín Kučera,
A full version of the paper presented at LPAR 2008 September 2008, 31 pages.
FIMU-RS-2008-08.
Available as Postscript,
PDF.
Abstract:
We show that several basic discounted properties of probabilistic
pushdown automata related both to terminating and non-terminating runs
can be efficiently approximated up to an arbitrarily
small given precision.
Stochastic Games with Branching-Time Winning Objectives
by
Tomáš Brázdil,
Václav Brožek,
Vojtěch Forejt,
Antonín Kučera,
A full version of the paper presented at LICS 2006. September 2006, 37 pages.
FIMU-RS-2006-02.
Available as Postscript,
PDF.
Abstract:
We consider stochastic turn-based games where the
winning objectives are given by formulae of the branching-time
logic PCTL. These games are generally not determined and
winning strategies may require memory and/or randomization.
Our main results concern history-dependent strategies.
In particular, we show that the problem whether there
exists a history-dependent winning strategy
in 1.5-player games is highly undecidable, even
for objectives formulated in the
L(F^{=5/8},F^{=1},F^{>0},G^{=1}) fragment of
PCTL. On the other hand, we show that the problem becomes decidable
(and in fact EXPTIME-complete) for the
L(F^{=1},F^{>0},G^{=1}) fragment of PCTL, where winning
strategies require only finite memory. This result is tight in the
sense that winning strategies for
L(F^{=1},F^{>0},G^{=1},G^{>0}) objectives may
already require infinite memory.