Reinforcement learning is a machine learning paradigm concerned with how an agent should act in an environment to achieve a goal. The agent learns through interaction: it observes the state of the environment and selects actions so as to maximize cumulative reward. Reinforcement learning has wide applications in intelligent control systems. One limitation, however, is the difficulty of handling uncertainty in the environment model: learning is usually performed without an explicit model, so environmental uncertainty and state transitions must be estimated from experience. Bayesian Networks are effective at modeling uncertainty and can be used to build a probabilistic model of the environment's dynamics. Incorporating uncertainty information into the environment model in this way yields a more accurate description of the environment's dynamic characteristics. In this study, we propose a reinforcement learning algorithm based on Bayesian Networks, employing modeling techniques such as optimal generalized residual differentiation and parallel integration causal directional reasoning to address reinforcement learning tasks. The main idea is to use a prior distribution to represent the uncertainty of the unknown parameters and then, as observations are collected, to compute the posterior distribution and thereby acquire knowledge about the environment. Experiments show that this approach is feasible for intelligent control systems operating in uncertain environments.
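To make the prior-to-posterior idea concrete, the following is a minimal sketch, not the paper's algorithm: it uses a Dirichlet prior over the next-state distribution of a small discrete MDP as the probabilistic model of environmental dynamics, and updates it to a posterior from observed transitions. The state/action sizes, pseudo-count value, and function names are illustrative assumptions; the Bayesian Network structure used in the study may differ.

```python
import numpy as np

# Sketch (assumed setup, not the paper's method): Bayesian estimation of an
# unknown transition model P(s' | s, a). A Dirichlet prior encodes the agent's
# uncertainty; each observed transition performs a conjugate posterior update.

N_STATES, N_ACTIONS = 5, 2          # hypothetical problem size
PRIOR_PSEUDOCOUNT = 1.0             # uniform Dirichlet prior (assumption)

# Dirichlet concentration parameters, one vector per (state, action) pair.
alpha = np.full((N_STATES, N_ACTIONS, N_STATES), PRIOR_PSEUDOCOUNT)

def update_posterior(s, a, s_next):
    """Conjugate update: add one pseudo-count for the observed transition."""
    alpha[s, a, s_next] += 1.0

def posterior_mean_transition(s, a):
    """Posterior mean estimate of P(s' | s, a), usable for planning."""
    return alpha[s, a] / alpha[s, a].sum()

def posterior_sample_transition(s, a, rng=np.random.default_rng()):
    """Posterior sample, e.g. for Thompson-sampling-style exploration."""
    return rng.dirichlet(alpha[s, a])

# Example: a few observed transitions shift belief toward states 2 and 3.
for s, a, s_next in [(0, 1, 2), (0, 1, 2), (0, 1, 3)]:
    update_posterior(s, a, s_next)

print(posterior_mean_transition(0, 1))
```

A model-based agent could then plan against the posterior mean or against posterior samples, so that actions account for how uncertain the learned dynamics still are.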