Package: markovDP 0.99.0
markovDP: Infrastructure for Discrete-Time Markov Decision Processes (MDP)
Provides the infrastructure to work with Markov Decision Processes (MDPs) in R. The focus is on convenient formulation of MDPs, support for sparse representations (using sparse matrices, lists, and data.frames), and visualization of results. Some key components are implemented in C++ to speed up computation. Several popular solvers are implemented.
Authors: Michael Hahsler
Downloads:
- Source: markovDP_0.99.0.tar.gz
- Windows binaries: markovDP_0.99.0.zip (r-4.5), markovDP_0.99.0.zip (r-4.4), markovDP_0.99.0.zip (r-4.3)
- macOS binaries: markovDP_0.99.0.tgz (r-4.5-x86_64), markovDP_0.99.0.tgz (r-4.5-arm64), markovDP_0.99.0.tgz (r-4.4-x86_64), markovDP_0.99.0.tgz (r-4.4-arm64), markovDP_0.99.0.tgz (r-4.3-x86_64), markovDP_0.99.0.tgz (r-4.3-arm64)
- Linux binaries (noble): markovDP_0.99.0.tar.gz (r-4.5-noble), markovDP_0.99.0.tar.gz (r-4.4-noble)
- WebAssembly binaries: markovDP_0.99.0.tgz (r-4.4-emscripten), markovDP_0.99.0.tgz (r-4.3-emscripten)
Documentation: markovDP.html
API: markovDP/json
NEWS
# Install 'markovDP' in R:
install.packages('markovDP', repos = c('https://mhahsler.r-universe.dev', 'https://cloud.r-project.org'))
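Once installed, a minimal quick start is sketched below using the bundled Maze dataset and the exported solve_MDP(), policy(), and reward() functions; relying on the solver's default settings is an assumption:

```r
library(markovDP)

# Load the bundled 4x3 maze gridworld
data(Maze)

# Solve the MDP with the default solver settings
# (assumption: the defaults handle this small model)
sol <- solve_MDP(Maze)

policy(sol)   # optimal action for each state
reward(sol)   # expected reward of the policy
```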
Bug tracker: https://github.com/mhahsler/markovdp/issues
- Cliff_walking - Cliff Walking Gridworld MDP (solved in the sketch below)
- DynaMaze - The Dyna Maze
- Maze - Stuart Russell's 4x3 Maze Gridworld MDP
- Windy_gridworld - Windy Gridworld MDP
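The bundled Cliff_walking gridworld, for example, can be loaded, solved, and plotted with the exported gridworld helpers; this sketch assumes gw_plot() accepts a solved model with sensible plotting defaults:

```r
library(markovDP)

# Load the bundled cliff-walking gridworld
data(Cliff_walking)

# Solve it and plot the grid with the learned policy
# (assumption: gw_plot() defaults suffice for a solved model)
sol <- solve_MDP(Cliff_walking)
gw_plot(sol)
```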
Tags: control-theory, markov-decision-process, optimization, cpp
Last updated 9 hours ago from 35cd802f83. Checks: 4 ERROR, 7 WARNING. Indexed: yes.
Target | Result | Latest binary |
---|---|---|
Doc / Vignettes | FAIL | Feb 20 2025 |
R-4.5-win-x86_64 | WARNING | Feb 20 2025 |
R-4.5-mac-x86_64 | WARNING | Feb 20 2025 |
R-4.5-mac-aarch64 | WARNING | Feb 20 2025 |
R-4.5-linux-x86_64 | WARNING | Feb 20 2025 |
R-4.4-win-x86_64 | WARNING | Feb 20 2025 |
R-4.4-mac-x86_64 | WARNING | Feb 20 2025 |
R-4.4-mac-aarch64 | WARNING | Feb 20 2025 |
R-4.3-win-x86_64 | ERROR | Feb 20 2025 |
R-4.3-mac-x86_64 | ERROR | Feb 20 2025 |
R-4.3-mac-aarch64 | ERROR | Feb 20 2025 |
Exports: A, absorbing_states, act, action, action_discrepancy, add_linear_approx_Q_function, add_policy, approx_greedy_action, approx_greedy_policy, approx_Q_value, approx_V_plot, available_actions, bellman_operator, bellman_update, colors_continuous, colors_discrete, convergence_horizon, create_basis_coefs, curve_multiple_directed, features2state, find_reachable_states, get_state_features, greedy_action, greedy_policy, gw_animate, gw_init, gw_matrix, gw_maze_MDP, gw_maze_MDPTF, gw_path, gw_plot, gw_plot_transition_graph, gw_random_maze, gw_rc2s, gw_read_maze, gw_s2rc, gw_transition_prob, gw_transition_prob_end_state, gw_transition_prob_named, gw_transition_prob_sparse, induced_reward_matrix, induced_transition_matrix, is_converged_MDP, is_solved_MDP, manual_policy, MDP, MDPTF, normalize_action, normalize_action_id, normalize_action_label, normalize_MDP, normalize_state, normalize_state_features, normalize_state_id, normalize_state_label, P_, plot_transition_graph, plot_value_function, policy, policy_evaluation, policy_evaluation_LP, Q_random, Q_values, Q_zero, R_, random_policy, reachable_states, regret, remove_unreachable_states, reward, reward_matrix, round_stochastic, s, S, sample_MDP, solve_MDP, solve_MDP_APPROX, solve_MDP_DP, solve_MDP_LP, solve_MDP_MC, solve_MDP_SAMP, solve_MDP_TD, solve_MDP_TDN, start_vector, state2features, transformation_fourier_basis, transformation_linear_basis, transformation_polynomial_basis, transformation_RBF_basis, transition_graph, transition_matrix, unreachable_states, V_random, V_zero, value_error, value_function, visit_probability
Dependencies: cli, codetools, cpp11, crayon, fastmap, float, foreach, glue, hms, igraph, iterators, lattice, lifecycle, lpSolve, magrittr, Matrix, MatrixExtra, pkgconfig, prettyunits, progress, R6, Rcpp, RhpcBLASctl, rlang, vctrs
Gridworlds as MDPs
Rendered from Gridworlds.Rmd using knitr::rmarkdown on Feb 20 2025.
Last update: 2025-02-20. Started: 2025-02-20.
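The vignette documents the gw_* helper family; the sketch below builds a random maze and inspects it (the dim argument name for gw_random_maze() is an assumption):

```r
library(markovDP)

# Hypothetical arguments: generate a 7x7 random maze gridworld
m <- gw_random_maze(dim = c(7, 7))

# Display the maze layout and its transition graph
gw_matrix(m)
gw_plot_transition_graph(m)
```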
Introduction to Discrete-Time Markov Decision Processes
Rendered from markovDP.Rmd using knitr::rmarkdown on Feb 20 2025.
Last update: 2025-02-20. Started: 2024-05-31.
Solving MDPs with Linear Approximation
Rendered from LinearApproximation.Rmd using knitr::rmarkdown on Feb 20 2025.
Last update: 2025-02-20. Started: 2025-02-20.
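A sketch of the linear-approximation workflow with the exported helpers; passing the Fourier-basis transformation function directly is an assumption about the interface:

```r
library(markovDP)
data(Maze)

# Attach a linear Q-function approximation
# (assumption: the transformation is passed as a function)
m <- add_linear_approx_Q_function(Maze,
  transformation = transformation_fourier_basis)

# Solve with episodic semi-gradient Sarsa and plot the
# approximate value function
sol <- solve_MDP_APPROX(m)
approx_V_plot(sol)
```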
Solving Tic-Tac-Toe as an MDP
Rendered from TicTacToe.Rmd using knitr::rmarkdown on Feb 20 2025.
Last update: 2025-02-20. Started: 2024-09-01.
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Absorbing States | absorbing_states absorbing_states.MDP absorbing_states.MDPTF |
Perform an Action | act act.MDP act.MDPTF |
Choose an Action Given a Policy | action |
Conversions for Action and State IDs and Labels | action_state_helpers features2state get_state_features normalize_action normalize_action_id normalize_action_label normalize_state normalize_state_features normalize_state_id normalize_state_label s state2features |
Available Actions in a State | available_actions |
Bellman Update and Bellman Operator | bellman_operator bellman_update |
Cliff Walking Gridworld MDP | Cliff_walking cliff_walking |
Default Colors for Visualization | colors colors_continuous colors_discrete |
Estimate the Convergence Horizon for an Infinite-Horizon MDP | convergence_horizon |
The Dyna Maze | DynaMaze dynamaze |
Find Reachable State Space from a Transition Model Function | find_reachable_states |
Greedy Actions and Policies | greedy_action greedy_policy |
Helper Functions for Gridworld MDPs | gridworld gw gw_animate gw_init gw_matrix gw_maze_MDP gw_maze_MDPTF gw_path gw_plot gw_plot_transition_graph gw_random_maze gw_rc2s gw_read_maze gw_s2rc gw_transition_prob gw_transition_prob_end_state gw_transition_prob_named gw_transition_prob_sparse |
Stuart Russell's 4x3 Maze Gridworld MDP | Maze maze |
Define an MDP Problem | A is_converged_MDP is_solved_MDP MDP P_ R_ S |
Define an MDP as an Agent Environment | MDPTF |
Extract, Create, or Add a Policy to a Model | add_policy induced_reward_matrix induced_transition_matrix manual_policy policy random_policy |
Policy Evaluation | policy_evaluation policy_evaluation_LP |
Q-Values | Q_random Q_values Q_zero |
Find Reachable States | reachable_states reachable_states.function reachable_states.MDP reachable_states.MDPTF |
Regret of a Policy and Related Measures | action_discrepancy regret value_error |
Calculate the Expected Reward of a Policy | reward reward.MDP |
Round a Stochastic Vector or a Row-Stochastic Matrix | round_stochastic |
Sample Trajectories from an MDP | sample_MDP sample_MDP.MDP |
Sample Trajectories from an MDPTF | sample_MDP.MDPTF |
Solve an MDP Problem | solve_MDP solve_MDP.MDP solve_MDP.MDPTF |
Episodic Semi-gradient Sarsa with Linear Function Approximation | add_linear_approx_Q_function add_linear_approx_Q_function.MDP add_linear_approx_Q_function.MDPTF approx_greedy_action approx_greedy_policy approx_Q_value approx_V_plot create_basis_coefs solve_MDP_APPROX transformation_fourier_basis transformation_linear_basis transformation_polynomial_basis transformation_RBF_basis |
Solve MDPs using Dynamic Programming | solve_MDP_DP |
Solve MDPs using Linear Programming | solve_MDP_LP |
Solve MDPs using Monte Carlo Control | solve_MDP_MC |
Solve MDPs using Random Sampling | solve_MDP_SAMP |
Solve MDPs using Temporal Differencing | solve_MDP_TD solve_MDP_TDN |
Sample a Start State | start start.MDP start.MDPTF |
Transition Graph | curve_multiple_directed plot_transition_graph transition_graph |
Access to Parts of the Model Description | accessors normalize_MDP reward_matrix start_vector transition_matrix |
Unreachable States | remove_unreachable_states unreachable_states |
Value Function | plot_value_function value_function V_random V_zero |
State Visit Probability | visit_probability |
Windy Gridworld MDP | Windy_gridworld windy_gridworld |
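The sample_MDP and visit_probability entries above combine into a typical post-solution workflow; a short sketch follows (the n argument name for the number of sampled episodes is an assumption):

```r
library(markovDP)
data(Maze)
sol <- solve_MDP(Maze)

# Sample trajectories under the solved policy
# (assumption: `n` sets the number of episodes)
traj <- sample_MDP(sol, n = 10)

# Probability of visiting each state under the policy
visit_probability(sol)
```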