Package: markovDP 0.99.0

markovDP: Infrastructure for Discrete-Time Markov Decision Processes (MDP)

Provides the infrastructure to work with Markov Decision Processes (MDPs) in R. The focus is on convenience in formulating MDPs, the support of sparse representations (using sparse matrices, lists and data.frames) and visualization of results. Some key components are implemented in C++ to speed up computation. Several popular solvers are implemented.

Authors: Michael Hahsler [aut, cph, cre]


# Install 'markovDP' in R:

install.packages('markovDP', repos = c('https://mhahsler.r-universe.dev', 'https://cloud.r-project.org'))
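
Once the package is installed, a first session might look like the sketch below. It uses only names that appear in the export and dataset lists on this page (`Maze`, `solve_MDP`, `policy`, `is_solved_MDP`), but the exact arguments and return values are assumptions here; the help manual is the authority. The `requireNamespace()` guard is added so the snippet skips cleanly when the package is unavailable, which may matter given the failing build checks reported on this page.

```r
# Hedged quick-start sketch; assumes markovDP installed and loads without errors.
if (requireNamespace("markovDP", quietly = TRUE)) {
  library(markovDP)

  data("Maze")                  # the 4x3 maze gridworld listed under Datasets
  sol <- solve_MDP(Maze)        # solver defaults are used as-is (assumption)
  print(policy(sol))            # show the policy stored in the solved model
  solved <- is_solved_MDP(sol)  # assumed to return TRUE for a solved model
} else {
  message("markovDP is not installed; skipping the example.")
  solved <- NA
}
```

Other solvers from the export list (for example `solve_MDP_TD` or `solve_MDP_LP`) would presumably be called in a similar way, but that is not verified here.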

Bug tracker: https://github.com/mhahsler/markovdp/issues

Uses libs:

c++ - GNU Standard C++ Library v3

Datasets:

Cliff_walking - Cliff Walking Gridworld MDP
DynaMaze - The Dyna Maze
Maze - Stuart Russell's 4x3 Maze Gridworld MDP
Windy_gridworld - Windy Gridworld MDP

On CRAN:

control-theory markov-decision-process optimization cpp

5.51 score 7 stars 4 scripts 101 exports 25 dependencies

Last updated 6 days ago from: 61eebb57c2. Checks: 1 OK, 11 ERROR. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Mar 13 2025
R-4.5-win-x86_64	ERROR	Mar 13 2025
R-4.5-mac-x86_64	ERROR	Mar 13 2025
R-4.5-mac-aarch64	ERROR	Mar 13 2025
R-4.5-linux-x86_64	ERROR	Mar 13 2025
R-4.4-win-x86_64	ERROR	Mar 13 2025
R-4.4-mac-x86_64	ERROR	Mar 13 2025
R-4.4-mac-aarch64	ERROR	Mar 13 2025
R-4.4-linux-x86_64	ERROR	Mar 13 2025
R-4.3-win-x86_64	ERROR	Mar 13 2025
R-4.3-mac-x86_64	ERROR	Mar 13 2025
R-4.3-mac-aarch64	ERROR	Mar 13 2025

Exports: A absorbing_states act action action_discrepancy add_linear_approx_Q_function add_policy approx_greedy_action approx_greedy_policy approx_Q_value approx_V_plot available_actions bellman_operator bellman_update colors_continuous colors_discrete convergence_horizon create_basis_coefs curve_multiple_directed features2state find_reachable_states get_state_features greedy_action greedy_policy gw_animate gw_init gw_matrix gw_maze_MDP gw_maze_MDPTF gw_path gw_plot gw_plot_transition_graph gw_random_maze gw_rc2s gw_read_maze gw_s2rc gw_transition_prob gw_transition_prob_end_state gw_transition_prob_named gw_transition_prob_sparse induced_reward_matrix induced_transition_matrix is_converged_MDP is_solved_MDP manual_policy MDP MDPTF normalize_action normalize_action_id normalize_action_label normalize_MDP normalize_state normalize_state_features normalize_state_id normalize_state_label P_ plot_transition_graph plot_value_function policy policy_evaluation policy_evaluation_LP Q_random Q_values Q_zero R_ random_policy reachable_states regret remove_unreachable_states reward reward_matrix round_stochastic s S sample_MDP schedule_exp schedule_exp2 schedule_harmonic schedule_linear schedule_log solve_MDP solve_MDP_APPROX solve_MDP_DP solve_MDP_LP solve_MDP_MC solve_MDP_SAMP solve_MDP_TD start_vector state2features transformation_fourier_basis transformation_linear_basis transformation_polynomial_basis transformation_RBF_basis transition_graph transition_matrix unreachable_states V_random V_zero value_error value_function visit_probability

Dependencies: cli codetools cpp11 crayon fastmap float foreach glue hms igraph iterators lattice lifecycle lpSolve magrittr Matrix MatrixExtra pkgconfig prettyunits progress R6 Rcpp RhpcBLASctl rlang vctrs

Gridworlds as MDPs

Michael Hahsler

Rendered from Gridworlds.Rmd using knitr::rmarkdown on Mar 13 2025.

Last update: 2025-02-20
Started: 2025-02-20

Introduction to Discrete-Time Markov Decision Processes

Michael Hahsler

Rendered from markovDP.Rmd using knitr::rmarkdown on Mar 13 2025.

Last update: 2025-03-06
Started: 2024-05-31

Solving MDPs with Linear Approximation

Michael Hahsler

Rendered from LinearApproximation.Rmd using knitr::rmarkdown on Mar 13 2025.

Last update: 2025-03-06
Started: 2025-02-20

Solving Tic-Tac-Toe as a MDP

Michael Hahsler

Rendered from TicTacToe.Rmd using knitr::rmarkdown on Mar 13 2025.

Last update: 2025-02-20
Started: 2024-09-01

Readme and manuals

Help Manual

Help page	Topics
Absorbing States	absorbing_states absorbing_states.MDP absorbing_states.MDPTF
Perform an Action	act act.MDP act.MDPTF
Choose an Action Given a Policy	action
Conversions for Action and State IDs and Labels	action_state_helpers features2state get_state_features normalize_action normalize_action_id normalize_action_label normalize_state normalize_state_features normalize_state_id normalize_state_label s state2features
Available Actions in a State	available_actions
Bellman Update and Bellman operator	bellman_operator bellman_update
Cliff Walking Gridworld MDP	Cliff_walking cliff_walking
Default Colors for Visualization	colors colors_continuous colors_discrete
Estimate the Convergence Horizon for an Infinite-Horizon MDP	convergence_horizon
The Dyna Maze	DynaMaze dynamaze
Find Reachable State Space from a Transition Model Function	find_reachable_states
Greedy Actions and Policies	greedy_action greedy_policy
Helper Functions for Gridworld MDPs	gridworld gw gw_animate gw_init gw_matrix gw_maze_MDP gw_maze_MDPTF gw_path gw_plot gw_plot_transition_graph gw_random_maze gw_rc2s gw_read_maze gw_s2rc gw_transition_prob gw_transition_prob_end_state gw_transition_prob_named gw_transition_prob_sparse
Stuart Russell's 4x3 Maze Gridworld MDP	Maze maze
Define an MDP Problem	A is_converged_MDP is_solved_MDP MDP P_ R_ S
Define an MDP as an Agent Environment	MDPTF
Extract, Create, or Add a Policy to a Model	add_policy induced_reward_matrix induced_transition_matrix manual_policy policy random_policy
Policy Evaluation	policy_evaluation policy_evaluation_LP
Q-Values	Q_random Q_values Q_zero
Find Reachable States	reachable_states reachable_states.function reachable_states.MDP reachable_states.MDPTF
Regret of a Policy and Related Measures	action_discrepancy regret value_error
Calculate the Expected Reward of a Policy	reward reward.MDP
Round a stochastic vector or a row-stochastic matrix	round_stochastic
Sample Trajectories from an MDP	sample_MDP sample_MDP.MDP
Sample Trajectories from an MDPTF	sample_MDP.MDPTF
Schedules to Reduce Alpha, Epsilon and Other Parameters	schedule schedule_exp schedule_exp2 schedule_harmonic schedule_linear schedule_log
Solve an MDP Problem	solve_MDP solve_MDP.MDP solve_MDP.MDPTF
Solve MDPs with Temporal Differencing and Linear Function Approximation	add_linear_approx_Q_function add_linear_approx_Q_function.MDP add_linear_approx_Q_function.MDPTF approx_greedy_action approx_greedy_policy approx_Q_value approx_V_plot solve_MDP_APPROX
Solve MDPs using Dynamic Programming	solve_MDP_DP
Solve MDPs using Linear Programming	solve_MDP_LP
Solve MDPs using Monte Carlo Control	solve_MDP_MC
Solve MDPs using Random-Sampling	solve_MDP_SAMP
Solve MDPs using Tabular Temporal Differencing	solve_MDP_TD
Sample a Start State	start start.MDP start.MDPTF
Transformation Functions for Linear Function Approximation	create_basis_coefs transformation transformation_fourier_basis transformation_linear_basis transformation_polynomial_basis transformation_RBF_basis
Transition Graph	curve_multiple_directed plot_transition_graph transition_graph
Access to Parts of the Model Description	accessors normalize_MDP reward_matrix start_vector transition_matrix
Unreachable States	remove_unreachable_states unreachable_states
Value Function	plot_value_function value_function V_random V_zero
State Visit Probability	visit_probability
Windy Gridworld MDP	Windy_gridworld windy_gridworld
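
To make concrete what the help topics on Bellman updates and dynamic programming solvers refer to, here is a minimal base-R sketch of value iteration. It does not use markovDP and is not the package's implementation; the two-state MDP, its rewards, and all names (`P`, `R`, `backup`, `greedy`) are made up for illustration.

```r
# Toy MDP (hypothetical): 2 states, 2 actions, discount factor 0.9.
states  <- c("s1", "s2")
actions <- c("stay", "move")
gamma   <- 0.9

# P[[a]][s, s2]: probability of moving from s to s2 under action a.
P <- list(
  stay = matrix(c(1, 0,
                  0, 1), nrow = 2, byrow = TRUE,
                dimnames = list(states, states)),
  move = matrix(c(0.2, 0.8,
                  0.8, 0.2), nrow = 2, byrow = TRUE,
                dimnames = list(states, states))
)

# R[s, a]: expected immediate reward for taking action a in state s.
R <- matrix(c(0, -1,
              1, -1), nrow = 2, byrow = TRUE,
            dimnames = list(states, actions))

# One Bellman backup: Q(s, a) = R(s, a) + gamma * sum_s2 P(s2 | s, a) * V(s2).
backup <- function(V) {
  sapply(actions, function(a) R[, a] + gamma * as.vector(P[[a]] %*% V))
}

# Value iteration: repeat the backup until the values stop changing.
V <- c(s1 = 0, s2 = 0)
repeat {
  Q     <- backup(V)
  V_new <- apply(Q, 1, max)
  delta <- max(abs(V_new - V))
  V     <- V_new
  if (delta < 1e-8) break
}

# Greedy policy with respect to the final Q-values.
greedy <- setNames(actions[apply(Q, 1, which.max)], states)
print(round(V, 3))
print(greedy)
```

In this toy problem the agent should move out of `s1` and then stay in `s2`, where it collects a reward of 1 per step. A package solver would typically add sparse representations and convergence diagnostics on top of this basic loop.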
