Package: markovDP 0.99.0
markovDP: Infrastructure for Discrete-Time Markov Decision Processes (MDP)
Provides the infrastructure to work with Markov Decision Processes (MDPs) in R. The focus is on convenience in formulating MDPs, the support of sparse representations (using sparse matrices, lists and data.frames) and visualization of results. Some key components are implemented in C++ to speed up computation. Several popular solvers are implemented.
Authors:
markovDP_0.99.0.tar.gz
markovDP_0.99.0.zip (r-4.7), markovDP_0.99.0.zip (r-4.6), markovDP_0.99.0.zip (r-4.5)
markovDP_0.99.0.tgz (r-4.6-x86_64), markovDP_0.99.0.tgz (r-4.6-arm64), markovDP_0.99.0.tgz (r-4.5-x86_64), markovDP_0.99.0.tgz (r-4.5-arm64)
markovDP_0.99.0.tar.gz (r-4.7-arm64), markovDP_0.99.0.tar.gz (r-4.7-x86_64), markovDP_0.99.0.tar.gz (r-4.6-arm64), markovDP_0.99.0.tar.gz (r-4.6-x86_64)
markovDP_0.99.0.tgz (r-4.6-emscripten)
# Install 'markovDP' in R:
install.packages('markovDP', repos = c('https://mhahsler.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/mhahsler/markovdp/issues
- Cliff_walking - Cliff Walking Gridworld MDP
- DynaMaze - The Dyna Maze
- Maze - Stuart Russell's 4x3 Maze Gridworld MDP
- Windy_gridworld - Windy Gridworld MDP
control-theory, markov-decision-process, optimization, cpp
Last updated from: e8ab3f595e. Checks: 13 OK. Indexed: yes.
| Target | Result | Time |
|---|---|---|
| linux-devel-arm64 | OK | 208 |
| linux-devel-x86_64 | OK | 216 |
| source / vignettes | OK | 323 |
| linux-release-arm64 | OK | 234 |
| linux-release-x86_64 | OK | 253 |
| macos-release-arm64 | OK | 156 |
| macos-release-x86_64 | OK | 512 |
| macos-oldrel-arm64 | OK | 210 |
| macos-oldrel-x86_64 | OK | 389 |
| windows-devel | OK | 220 |
| windows-release | OK | 211 |
| windows-oldrel | OK | 212 |
| wasm-release | OK | 1276 |
Exports: A, absorbing_states, act, action, action_discrepancy, add_policy, approx_greedy_action, approx_greedy_policy, approx_Q_value, approx_V_plot, approx_value, available_actions, bellman_operator, bellman_update, colors_continuous, colors_discrete, convergence_horizon, create_basis_coefs, curve_multiple_directed, features2state, find_reachable_states, get_state_features, greedy_action, greedy_policy, gw_animate, gw_init, gw_matrix, gw_maze_MDP, gw_path, gw_plot, gw_plot_transition_graph, gw_random_maze, gw_rc2s, gw_read_maze, gw_s2rc, gw_transition_model, gw_transition_model_end_state, gw_transition_model_named, gw_transition_model_sparse, induced_reward_matrix, induced_transition_matrix, is_converged_MDP, is_solved_MDP, manual_policy, MDP, MDPSample, normalize_action, normalize_action_id, normalize_action_label, normalize_MDP, normalize_state, normalize_state_features, normalize_state_id, normalize_state_label, P_, pi_approx_linear, plot_transition_graph, plot_value_function, policy, policy_evaluation, policy_evaluation_bellman, policy_evaluation_LP, policy_evaluation_MC, q_approx_linear, Q_random, Q_values, Q_zero, R_, random_policy, reachable_states, regret, remove_unreachable_states, reward, reward_matrix, round_stochastic, s, S, sample_MDP, schedule_exp, schedule_exp2, schedule_harmonic, schedule_linear, schedule_log, solve_MDP, solve_MDP_APPROX, solve_MDP_DP, solve_MDP_LP, solve_MDP_MC, solve_MDP_PG, solve_MDP_SAMP, solve_MDP_TD, start_vector, state2features, transformation_fourier_basis, transformation_linear_basis, transformation_polynomial_basis, transformation_RBF_basis, transition_graph, transition_matrix, unreachable_states, v_approx_linear, V_random, V_zero, value_error, value_function, visit_probability
Dependencies: cli, codetools, cpp11, crayon, fastmap, float, foreach, glue, hms, igraph, iterators, lattice, lifecycle, lpSolve, magrittr, Matrix, MatrixExtra, pkgconfig, prettyunits, progress, R6, Rcpp, RhpcBLASctl, rlang, vctrs
Readme and manuals
Help Manual
| Help page | Topics |
|---|---|
| Absorbing States | absorbing_states absorbing_states.MDP absorbing_states.MDPSample |
| Perform an Action | act act.MDPModel act.MDPSample |
| Choose an Action Given a Policy | action |
| Conversions for Action and State IDs and Labels | action_state_helpers features2state get_state_features normalize_action normalize_action_id normalize_action_label normalize_state normalize_state_features normalize_state_id normalize_state_label s state2features |
| Available Actions in a State | available_actions |
| Bellman Update and Bellman operator | bellman_operator bellman_update |
| Cliff Walking Gridworld MDP | Cliff_walking cliff_walking |
| Default Colors for Visualization | colors colors_continuous colors_discrete |
| Estimate the Convergence Horizon for an Infinite-Horizon MDP | convergence_horizon |
| The Dyna Maze | DynaMaze dynamaze |
| Find Reachable State Space from a Transition Model Function | find_reachable_states |
| Greedy Actions and Policies | greedy_action greedy_policy |
| Helper Functions for Gridworld MDPs | gridworld gw gw_animate gw_init gw_matrix gw_maze_MDP gw_path gw_plot gw_plot_transition_graph gw_random_maze gw_rc2s gw_read_maze gw_s2rc gw_transition_model gw_transition_model_end_state gw_transition_model_named gw_transition_model_sparse |
| Linear Function Approximation | approx_value linear_function_approximation pi_approx_linear q_approx_linear v_approx_linear |
| Stuart Russell's 4x3 Maze Gridworld MDP | Maze maze |
| Define an MDP Problem | A is_converged_MDP is_solved_MDP MDP MDPModel P_ R_ S |
| Define an MDP With Only Sample Access | MDPSample |
| Extract, Create, or Add a Policy to a Model | add_policy induced_reward_matrix induced_transition_matrix manual_policy policy random_policy |
| Policy Evaluation | policy_evaluation policy_evaluation_bellman policy_evaluation_LP policy_evaluation_MC |
| Q-Values | Q_random Q_values Q_zero |
| Find Reachable States | reachable_states reachable_states.function reachable_states.MDPModel reachable_states.MDPSample |
| Regret of a Policy and Related Measures | action_discrepancy regret value_error |
| Calculate the Expected Reward of a Policy | reward reward.MDP |
| Round a stochastic vector or a row-stochastic matrix | round_stochastic |
| Sample Trajectories from an MDP | sample_MDP sample_MDP.MDP |
| Sample Trajectories from an MDPSample | sample_MDP.MDPSample |
| Schedules to Reduce Alpha, Epsilon and Other Parameters | schedule schedule_exp schedule_exp2 schedule_harmonic schedule_linear schedule_log |
| Solve an MDP Problem | solve_MDP solve_MDP.MDP solve_MDP.MDPSample |
| Solve MDPs with Temporal Differencing with Function Approximation | approx_greedy_action approx_greedy_policy approx_Q_value approx_V_plot solve_MDP_APPROX |
| Solve MDPs using Dynamic Programming | solve_MDP_DP |
| Solve MDPs using Linear Programming | solve_MDP_LP |
| Solve MDPs using Monte Carlo Control | solve_MDP_MC |
| Solve MDPs with Policy Gradient Methods | solve_MDP_PG |
| Solve MDPs using Random-Sampling | solve_MDP_SAMP |
| Solve MDPs using Tabular Temporal Differencing | solve_MDP_TD |
| Sample a Start State | start start.MDPModel start.MDPSample |
| Transformation Functions for Linear Function Approximation | create_basis_coefs transformation transformation_fourier_basis transformation_linear_basis transformation_polynomial_basis transformation_RBF_basis |
| Transition Graph | curve_multiple_directed plot_transition_graph transition_graph |
| Access to Parts of the Model Description | accessors normalize_MDP reward_matrix start_vector transition_matrix |
| Unreachable States | remove_unreachable_states unreachable_states |
| Value Function | plot_value_function value_function V_random V_zero |
| State Visit Probability | visit_probability |
| Windy Gridworld MDP | Windy_gridworld windy_gridworld |
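
Several help topics above concern Bellman updates and dynamic programming solvers. For readers new to these terms, the following is a minimal base-R sketch of value iteration on a toy two-state MDP. It does not use markovDP and is not the package's implementation: the transition list `P`, the reward matrix `R`, the discount factor `gamma`, and the helper `bellman_backup()` are hypothetical names and numbers chosen only for illustration.

```r
# Toy MDP with 2 states and 2 actions (hypothetical numbers, not from markovDP).
# P[[a]][s, s2] is the probability of moving from state s to s2 under action a.
P <- list(
  stay = matrix(c(1, 0,
                  0, 1), nrow = 2, byrow = TRUE),
  move = matrix(c(0, 1,
                  1, 0), nrow = 2, byrow = TRUE)
)

# R[s, a] is the expected immediate reward for taking action a in state s.
R <- matrix(c(0, 1,
              2, 0), nrow = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2"), c("stay", "move")))

gamma <- 0.9  # discount factor

# One Bellman update (illustrative helper):
# Q(s, a) = R(s, a) + gamma * sum over s2 of P(s2 | s, a) * V(s2)
bellman_backup <- function(V) {
  Q <- sapply(names(P), function(a) R[, a] + gamma * as.vector(P[[a]] %*% V))
  list(V = apply(Q, 1, max),
       policy = colnames(Q)[apply(Q, 1, which.max)])
}

# Value iteration: repeat the update until the values stop changing.
V <- c(0, 0)
repeat {
  res <- bellman_backup(V)
  delta <- max(abs(res$V - V))
  V <- res$V
  if (delta < 1e-8) break
}

print(round(V, 3))
print(res$policy)
```

In this toy problem the values converge to about 19 for `s1` and 20 for `s2`, and the greedy policy is to move out of `s1` and stay in `s2`. The package's own solvers and model format may differ from this sketch; consult the help pages listed above for the actual interface.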
