Package: markovDP 0.99.0
markovDP: Infrastructure for Discrete-Time Markov Decision Processes (MDP)
Provides the infrastructure to work with Markov Decision Processes (MDPs) in R. The focus is on convenience in formulating MDPs, support for sparse representations (using sparse matrices, lists, and data.frames), and visualization of results. Some key components are implemented in C++ to speed up computation. Several popular solvers are implemented.
Authors: Michael Hahsler
# Install 'markovDP' in R:
install.packages('markovDP', repos = c('https://mhahsler.r-universe.dev', 'https://cloud.r-project.org'))
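Once installed, a minimal usage sketch might look like the following. This assumes the bundled Maze data set and the default settings of solve_MDP(); see the help manual below for the exact signatures.

# Minimal sketch (assumptions: default solver settings, bundled Maze data set)
library(markovDP)
data(Maze)                 # Stuart Russell's 4x3 maze gridworld
sol <- solve_MDP(Maze)     # solve with the default solver
policy(sol)                # extract the resulting policy
gw_plot(sol)               # visualize the policy on the grid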
Bug tracker: https://github.com/mhahsler/markovdp/issues
Datasets (a short loading example follows this list):
- Cliff_walking - Cliff Walking Gridworld MDP
- DynaMaze - The Dyna Maze
- Maze - Stuart Russell's 4x3 Maze Gridworld MDP
- Windy_gridworld - Windy Gridworld MDP
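The bundled gridworlds can be loaded with data() and inspected with the gw_* helpers. A hedged sketch (the exact output of the helpers may differ):

# Load a bundled gridworld and inspect it (sketch; helper output may differ)
library(markovDP)
data(Cliff_walking)
gw_matrix(Cliff_walking)                # grid layout as a matrix
gw_plot_transition_graph(Cliff_walking) # plot the transition structure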
Keywords: control-theory, markov-decision-process, optimization, cpp
Last updated 10 hours ago from 73cf5e375c. Checks: 1 OK, 8 ERROR. Indexed: yes.
Target | Result | Latest binary |
---|---|---|
Doc / Vignettes | OK | Jan 18 2025 |
R-4.5-win-x86_64 | ERROR | Jan 18 2025 |
R-4.5-linux-x86_64 | ERROR | Jan 18 2025 |
R-4.4-win-x86_64 | ERROR | Jan 18 2025 |
R-4.4-mac-x86_64 | ERROR | Jan 18 2025 |
R-4.4-mac-aarch64 | ERROR | Jan 18 2025 |
R-4.3-win-x86_64 | ERROR | Jan 18 2025 |
R-4.3-mac-x86_64 | ERROR | Jan 18 2025 |
R-4.3-mac-aarch64 | ERROR | Jan 18 2025 |
Exports: A, absorbing_states, act, action, action_discrepancy, add_linear_approx_Q_function, add_policy, approx_greedy_action, approx_greedy_policy, approx_Q_value, available_actions, bellman_operator, bellman_update, colors_continuous, colors_discrete, convergence_horizon, curve_multiple_directed, find_reachable_states, greedy_action, greedy_policy, gw_animate, gw_init, gw_matrix, gw_maze_MDP, gw_path, gw_plot, gw_plot_transition_graph, gw_random_maze, gw_rc2s, gw_read_maze, gw_s2rc, gw_transition_prob, gw_transition_prob_end_state, gw_transition_prob_named, gw_transition_prob_sparse, induced_reward_matrix, induced_transition_matrix, is_converged_MDP, is_solved_MDP, manual_policy, MDP, normalize_MDP, P_, plot_transition_graph, plot_value_function, policy, policy_evaluation, policy_evaluation_LP, Q_random, Q_values, Q_zero, R_, random_policy, regret, remove_unreachable_states, reward, reward_matrix, round_stochastic, S, sample_MDP, solve_MDP, solve_MDP_DP, solve_MDP_LP, solve_MDP_MC, solve_MDP_SAMP, solve_MDP_sampling, solve_MDP_SGD, solve_MDP_TD, solve_MDP_TDN, start_vector, transition_graph, transition_matrix, unreachable_states, V_random, V_zero, value_error, value_function, visit_probability
Dependencies: cli, codetools, cpp11, crayon, fastmap, float, foreach, glue, hms, igraph, iterators, lattice, lifecycle, lpSolve, magrittr, Matrix, MatrixExtra, pkgconfig, prettyunits, progress, R6, Rcpp, RhpcBLASctl, rlang, vctrs
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Absorbing States | absorbing_states |
Perform an Action | act |
Choose an Action Given a Policy | action action.MDP |
Available Actions in a State | available_actions |
Bellman Update and Bellman Operator | bellman_operator bellman_update |
Cliff Walking Gridworld MDP | Cliff_walking cliff_walking |
Default Colors for Visualization | colors colors_continuous colors_discrete |
Estimate the Convergence Horizon for an Infinite-Horizon MDP | convergence_horizon |
The Dyna Maze | DynaMaze dynamaze |
Find Reachable State Space from a Transition Model Function | find_reachable_states |
Greedy Actions and Policies | greedy_action greedy_policy |
Helper Functions for Gridworld MDPs | gridworld gw gw_animate gw_init gw_matrix gw_maze_MDP gw_path gw_plot gw_plot_transition_graph gw_random_maze gw_rc2s gw_read_maze gw_s2rc gw_transition_prob gw_transition_prob_end_state gw_transition_prob_named gw_transition_prob_sparse |
Stuart Russell's 4x3 Maze Gridworld MDP | Maze maze |
Define an MDP Problem | A is_converged_MDP is_solved_MDP MDP P_ R_ S |
Extract, Create, and Add a Policy to a Model | add_policy induced_reward_matrix induced_transition_matrix manual_policy policy random_policy |
Policy Evaluation | policy_evaluation policy_evaluation_LP |
Q-Values | Q_random Q_values Q_zero |
Regret of a Policy and Related Measures | action_discrepancy regret value_error |
Calculate the Expected Reward of a Policy | reward reward.MDP |
Round a Stochastic Vector or a Row-Stochastic Matrix | round_stochastic |
Sample Trajectories from an MDP | sample_MDP |
Solve an MDP Problem | solve_MDP |
Solve MDPs using Dynamic Programming | solve_MDP_DP |
Solve MDPs using Linear Programming | solve_MDP_LP |
Solve MDPs using Monte Carlo Control | solve_MDP_MC |
Solve MDPs using Random-Sampling | solve_MDP_SAMP solve_MDP_sampling |
Episodic Semi-gradient Sarsa with Linear Function Approximation | add_linear_approx_Q_function approx_greedy_action approx_greedy_policy approx_Q_value solve_MDP_SGD |
Solve MDPs using Temporal Differencing | solve_MDP_TD solve_MDP_TDN |
Transition Graph | curve_multiple_directed plot_transition_graph transition_graph |
Access to Parts of the Model Description | accessors normalize_MDP reward_matrix start_vector transition_matrix |
Unreachable States | remove_unreachable_states unreachable_states |
Value Function | plot_value_function value_function V_random V_zero |
State Visit Probability | visit_probability |
Windy Gridworld MDP | Windy_gridworld windy_gridworld |
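To illustrate a few of the model accessors listed above, here is a hedged sketch; it assumes each accessor takes the model as its first argument (see the help pages for the full argument lists).

# Sketch of the model accessors (assumption: the model is the first argument)
library(markovDP)
data(Maze)
S(Maze)                    # state names
A(Maze)                    # action names
start_vector(Maze)         # start state distribution
transition_matrix(Maze)    # transition model P(s' | s, a)
absorbing_states(Maze)     # states that cannot be left once entered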