Planning and Decision-making for Connected Autonomous Vehicles at Road Intersections: A Review

Planning and decision-making technology at intersections is a comprehensive research problem in intelligent transportation systems due to the uncertainties caused by a variety of traffic participants. As wireless communication advances, vehicle infrastructure integrated algorithms designed for intersection planning and decision-making have received increasing attention. In this paper, the recent studies on the planning and decision-making technologies at intersections are primarily overviewed. The general planning and decision-making approaches are presented, which include graph-based approach, prediction base approach, optimization-based approach and machine learning based approach. Since connected autonomous vehicles (CAVs) is the future direction for the automated driving area, we summarized the evolving planning and decision-making methods based on vehicle infrastructure cooperative technologies. Both four-way signalized and unsignalized intersection(s) are investigated under purely automated driving traffic and mixed traffic. The study benefit from current strategies, protocols, and simulation tools to help researchers identify the presented approaches’ challenges and determine the research gaps, and several remaining possible research problems that need to be solved in the future.


Introduction
Over the past decades, there are assive amounts of investments that have een put into elieving the traffic congestion problem around the world. However, road traffic congestion is still one of the major concerns of travelers. It is found that congestion can inflict high economic costs [1]. The cost in the United States is up to $124.2 billion in 2013 and is expected to increase to $186.2 billion in 2030 with traffic congestion forecast to increase by 60% [2]. In this paper, the study focused on addressing the methods of planning and decision-making methods at intersections. We believe that intersections could be one of the most critical traffic environments from the perspective of safety and mobility [3][4][5][6][7].
From the previous research works, intersections are important parts of traffic networks. The commonly used approach to manage the intersection is traffic signals control. However, the efficiency of the traditional traffic lights is relatively low when traffic volumes are high [8]. Currently, over 30,0000 traffic lights among the U.S. with 82.7 billion investments [9]. It is necessarily to operate the traffic signal efficiently to reduce the congestion and improvement the mobility at intersections. However, the majority of signals in the U.S. are pre-timed and updated every 2-5 years. The re-timing traffic signals has huge benefits. Ref. [10] discussed how this type of traffic engineering maintenance can significantly reduce delays and stops experienced by automobile drivers. The results showed that the benefit to cost ratio for signal re-timing is approximately 40:1.
Moreover, researchers found that most incidents occurred in intersections and caused significant traffic delays in urban area [4]. From reports published by National Highway Traffic Safety Administration (NHTSA), intersection crashes are among the most serious crash modes in the U.S. In 2015, crashes in intersections accounted for 1.3 million police-reported crashes and 5251 fatal crashes [11,12]. These crashes correspond to approximately 20% of all police reported crashes and 16% of all fatal crashes [13]. Therefore, it is important to develop technologies at intersections to allow vehicles to pass the intersection safely and efficiently.
With the autonomous vehicle technology and the wireless communication technology development, the field of the autonomous intersection management (AIM) system has attracted a lot of attention and shows lots of research and development efforts. AIM can coordinate the movement of autonomous vehicles through intersections. A typical complete AIM system is shown as Figure 1. There are several important components of the system: Vehicle on board unit (OBU), Roadside unit (RSU), Signal Controller, Traffic control unit (TCU), Traffic control center (TCC), Traffic operation center (TOC), and Cloud. For planning and decision-making purpose, the OBU should contain vehicle dynamic control and intersection approach/departure applications in intersection segment, the RSU should provide vehicle lane group control, vehicle driving reservation and planning to help vehicles cross the intersection. The signal controller helps control the traffic signal based on the data received. TCU/TCC/ TOC will coordinate to determine and execute the optimal intersection passing movement. The Cloud system could interact with other Ur ban Services/Applications. However, if we want autonomous vehicles to operate at intersection with high efficiency and safety, we must design an appropriate way to coordinate them carefully [14,15]. Previously, intersection management research separated into two branches, i.e., cooperative resource reservation and trajectory planning [16]. In cooperative resource reservation, the intersection space is separated into the time-space sequence that is assigned to vehicles and scheduled to achieve a safe crossing. The reservations can be managed by a centralized unit [17,18] or in a distributed manner [19,20]. As for trajectory-planning approaches, researchers focused on the movement of vehicles relative to a fixed point in the intersection (which resembles the on-ramp merging solutions) to achieve a safe crossing of vehicles [21][22][23][24]. Liu et al. proposed a cooperative trajectory planning strategy to pass through an intersection with V2X communication. The results showed that the proposed strategy has the smallest evacuation time among that in fixed lights and adaptive light [25].
As a key component of the automated intersection management system, decision-making and planning of vehicles becomes one of the major tasks to be solved [26]. For example, rule-based decision-making approaches are commonly used in autonomous driving. However, such rule-based approaches are reliable and easy to interpret. For simple driving scenarios, it can achieve a good performance on safety with hand-engineered rules. But when dealing with complex urban environments, such as road intersections, where various uncertainties exist, a rule-based approach cannot maintain safe and efficient driving [27]. Thus, some other approaches with more intelligence, such as optimization and learning-based methods, have been widely discussed in past decades. Since no uniform model can handle all scenarios, the planning and decision-making methods of vehicles in the complex urban driving environment is still challenging.
Refs. [28] and [29] uses optimization methods to generate paths and speed for the vehicles at intersections in the goal of reducing overall fuel consumption and increasing transportation efficiency, in fully connected and partially connected traffics. Where Ref. [30] also considered the quality of the generated path and introduced a heuristic algorithm to refine the trajectories of a team of connected vehicles. Refs. [31] and [32] implemented reinforcement learning to solve the problem considering safety, efficiency and task completion in connected and unconnected scenarios accordingly.
In this paper, we mainly focused on planning and decision-making technologies at road intersections, as shown in Figure 2. Firstly, we divide the general planning and decision-making technologies at intersections from the previous paper into several major approaches: graphbased approach, prediction-based, optimization-based approach, and machine learning based approach. Then, we summarized those technologies used in vehicle infrastructure cooperative environment. Both signalized and unsignalized intersection(s) are considered with purely automated driving traffic and mixed traffic. We believe the study learned from current strategies, protocols, and simulation tools help researchers identify the presented approaches' challenges and determine the open issues for future research.
The rest of this paper is organized as follows: In Section 2, we introduce the general planning and decisionmaking technologies at intersections. In Section 3, we show that the Vehicle-Infrastructure cooperation control method for intersections. In this section, we also discuss in detail about the impact of wireless communication technologies. Then, we compare existed state-of-art under mixed traffic environment, and their according performances are studied in follow Section 4. Furthermore, the previous approaches' challenges and determine the open issues for future research are discussed in Section 5. Finally, Section 6 concludes this paper.

General Planning and Decision-making Methods at Intersections
Planning and decision-making algorithms at intersections have been rapidly developed over the years. Which includes planning algorithms such as optimal trajectory generation within the topology occupation-free space, and decision-making methods which mainly focus on speed profile generation along the generated trajectory. These methods generally share the same major goal which is establish efficient, safe, and real-time behaviors of the ego vehicle. Overall, planning and decision-making algorithms of autonomous vehicles could be divided into several categories, which include graph-based approach, optimization-based approach and machine learningbased approach. These methods are explained as follows.

Graph Based Approach
Graph based method which is also widely used in mobile robotics could help vehicle find a path within the free space. At intersections, an explore searching method has been implemented, which is the rapid random tree search (RRT) method. The RRT method is first introduced by LaValle [33]. This algorithm could search a possible path under a very short period and could enable fast regeneration if the environment changes. The algorithm follows a growing tree based searching method, the tree growing method could be briefly concluded as, first, a point is randomly sampled in the free searching space, secondly, the newly sampled point and each of the tree nodes are connected, the distance is calculated from each of the tree nodes to the newly sampled point and a tree node is selected from certain heuristics, for example, the shortest distance. Then from the nearest point to the sampled point's direction, a line with a predefined length is extended from the nearest point to obtain a new point (let us named it point x). Lastly, if the line segment between the nearest point and point x are within the free space, this point x and the connect line is attached as part of the search tree. After the RRT method was proposed, many variants developed from this algorithm has been proposed. Some classic variations including the RRT* algorithm [34] and CL-RRT algorithm [35] which are two state-of-the-art variants in literature and are often been compared with as benchmarks. RRT* provides two additional steps compared with traditional RRT by adding a rewire process to the search tree growing process, and had shown that it could provide asymptotic optimal paths with enough searching time with proper modifications [36]. CL-RRT uses the closed-loop simulation while generating the tree to provide better feasibility of the generated paths. The algorithm first generate tree search paths, then instead of using straight lines to connect the nodes, it uses a simple controller to generate the path from the father node to the new node. This method enables better control feasibility as kinematic has been considered during the path generation process. With the emerging development of the RRT family, RRT-based algorithms had shown to satisfying performances and are widely applied in autonomous driving including intersection scenarios.
Ma et al. [37] implemented a variation of the RRT algorithms family to generate paths at an intersection. It makes uses of the road geometry in the first place and pre-generates splines accordingly as the base tree. Using the spline as the initial tree, more branches are generated so that a collision free path could be found. By regenerating the search tree repeatedly at a high frequency, obstacle avoidance could be performed. Chen et al. [38] formed the problem in to a hierarchical planning problem. The major method could be divided into three layers, high-level route planning layer, mid-level task planning layer and the low-level motion planning layer. The high-level route planning layer first generates several way-points given the road geometry. Then, the task planner generates tasks accordingly to the received way-points. Finally, the low-level motion planner generates real-time control commands for each tasks as intersections. Yoon et al. [39] introduced a spline-based RRT* algorithm which combined cubic Bézier polynomial curves and RRT* search method to achieve high efficient collision avoidance at intersections under continuous search space. Mehta et al. [40] introduced a human interface into the RRT algorithm which enables an additional experienced drivers knowledge that could assist the process of re-planning.
However, RRT algorithms are mostly implemented in maze and other static environments, with the complexity and fast-changing nature of intersections leads to conservative decisions and the refresh rate of the planner may be challenging in some critical corner case scenarios.

Prediction-Based Approach
Only considers and plans on the current state could be dangerous at times and might be insufficient to guarantee full-safety. Thus, planning considering the predicted future path of the environment vehicles is one of the very popular ways to solve intersection planning problems, crediting from the existing plentiful motion prediction and risk assessment algorithms [41][42][43][44][45][46][47][48]. Huang et al. [44] trained a variational neural network using multiple sensors to predict the conditional variational distribution of the possible future states of the ego vehicle.
The proposed algorithm is able to cooperate with various uncertainties and also are able to coordinate between different sensors' information, by combining the physicsbased model to the system, it can increase the prediction accuracy and other performances of the prediction process. Ref. [43] introduced an RNN-based method to predict the surrounding vehicle's intentions using Lidartracking information. The recurrent neural network was trained by a dataset containing of unsignalized intersection naturalistic driving behaviors. The naturalistic driving dataset was also used to evaluate the performance of the prediction results. Using this method, the predictor was able to predict severe potential collisions for more than a second in advance. Jeong et al. [49] used support vector machine to predict the intentions of environment vehicles, given the predicted states, the vehicle could generate and perform human-like planning behaviors at unprotected intersections.
Jeong et al. [50] trained a long short-term memorybased RNN to predict the future states of the surrounding vehicles, with the prediction results, the planner was able to increase safety. Wang et al. [51] operates the precision process using the probabilistic occupancy to predict the motions of vehicles with Monte Carlo simulation, the prediction model was preset and which is obtained from off-line training, then it is used for on-line planning to achieve good real-time performance of planning result. Ref. [52] also introduced a LSTM model that combines intention estimation and trajectory prediction, the framework first trained a high-level intention estimator to predict the rough path of the vehicle at intersections then it performs a more detailed trajectory prediction at intersections. For the prediction and planning framework, if the prediction result is sound and accurate, this type of methods could provide sufficient safety performance and could be implemented under different scenarios. Therefore, developing and applying these path prediction and risk assessment algorithms, the result of the intersection planning performance could be improved greatly.

Optimization Based Approach
Another popular approach for planning at intersections is by modeling the planning problem into a real-time optimization problem, by properly study and set up the cost function and boundaries as well as constrains, planning and decision-making could be operated in a heuristic manner. One major type of optimization method used for autonomous driving at in intersections is the model predictive control method. The key components in a model predictive control framework include an obstacle model, a vehicle model and a proper optimization solving method.
The vehicle model includes the point-mass [53] model which takes the vehicle as a single point with a certain mass, which does not consider the vehicle's shape or kinematic [54] or dynamic [55]. The kinematics model on the other hand, takes in the vehicle geometry into consideration which is often used in path planning, one classical type of formulation of the kinematic formulation are knows as the bicycle modeling. The dynamic model is established considering the dynamic characters of the vehicle, which could include tire model, vehicle body dynamic model which are nonlinear, and are often used in vehicle control problems and highway scenarios with large or rapid change lateral forces. The obstacle model which are established and set for obstacle avoidance includes 2 types of classical approaches. One of them is through setting a virtual repulsive force as part of the reward function to add forces to keep a safe distance from the obstacle to the ego vehicle, this type of formulation are also known as the potential field method. A variable for additional slackness could be added to the constraints of the optimization process, which are able for more acceptable feasibility. However, sometime this kind of modifications could lead to nonconvex and nonlinear cost functions, and proper linearization or approximation for the optimization process is necessary. One of the other approach is through properly setting the constrains which allow the ego vehicle to operate within the obstacle-free space, but in reality, the constrains obtained from the real-world are most likely nonconvex. Solving those nonconvex problems could lead to very high computation power requirements, therefore, the model is often modified to a linear programming (LP) problem, quadratic programming (QP) problems [56], etc. or dynamic programming problem. After the model has been properly established and set to a typical optimization problem, then this problem is solved by using the corresponding optimization solver.
Optimal control method such as model predictive control (MPC) has been broadly developed and used in solving planning and decision-making problems for autonomous vehicles at intersections. Authors in Ref.
[57] designed a Robust Model Predictive Control planning algorithm to guarantee sufficient safe gaps at critical intersection scenarios. This method improves the driving efficiency and driving comfort and also refines the model with "Affine Disturbance Feedback". Ref. [51] introduced a hierarchical method for planning at intersections, it first used road geometry to generate some reference initial trajectories, then model predictive control was used to optimize the initial trajectories and generated decisions and the corresponding motions along the refined path. During this process, the constrains and optimization boundaries of the optimization problem was easy to be implemented through coordinate transformation. For the final level, probabilistic-occupancy grid-based predictions of environment vehicles which were obtained before-hand to enable better real-time performances. The simulation results have showed that there exists good improvement of the planned path and speed, which also demonstrated the efficiency of the proposed approach.
Hult et al. [58] presented a bi-level (coordination arrangement level and vehicle motion planning level) MPC for intersection planning. Which coordinated occupancy time slots before the vehicle enters the intersection and provides ego vehicle direct control inputs separately. This approach had shown solid improvements on the feasibility and stability of the planning performance at the intersections with certain assumptions. Nilsson et al. [59] used two loosely coupled MPCs to enable better planning performance, the idea of decoupling of the longitudinal and lateral of the vehicle control is introduced and the two MPCs are assigned to each direction accordingly. Liu et al. [60] introduced a MPC model that could automatically select appropriate parameters for various types of maneuvers to increase the generality usage of the planner. To obtain better driving comfort, a special lane-associated potential field was implemented. Also, the environment vehicles are seen as polygons which are modified to be converted into hard constraints in the MPC problem to ensure safety. Ref. [61] generates speeds of the vehicle within a fixed generated path from a set of waypoints by using temporal optimization. However, the problem turns out to be nonconvex. To solve this problem, the MPC problem was simplified by a set of quadratic programs and were solved by the slack convex feasible set (SCFS) algorithm to improve the real-time performance of the proposed method. Ref. [17] formed the two-dimensional intersection steering and acceleration planning problem into an optimization problem, the motion dynamics of ego vehicle were considered in a distance scale instead of a time scale. The optimization problem was solved based on Minimum Principle of Portraying.
Other optimal control methods such as the Bezier curve optimization method is implemented for intersection planning. Ref. [62] considers constrains such as collision avoidance and formulates the planning problem as an optimization problem under equality constrains. Then, the optimization is solved using Quadratic programming (QP) and Hildreth's algorithm. Ref. [63] assumed that the ego vehicle is able to detect all surrounding obstacles. With this assumption, it introduced a reactive and adaptive path planner and also formalized the problem as a Bezier curve optimization problem with certain constrains and solved it using Lagrangian and gradient-based methods.

Machine Learning Based Approach
Machine learning (ML) approach has been a boosting approach due to the prosperous development of the artificial intelligence and the wider application of the computer science. By using these ML approaches, the ego vehicles is able to implement more human-like driving behaviors by studying from naturalistic driving data or from driving expert experiences. Using those methods, the ego vehicle is able to cooperate under complex scenarios.

End-to-End Neural Network Approach
One common approach for the neural network (NN) approach is the end-to-end approach, that is, by feeding in the images and sensor information of expert drivers driving at intersections and the actual actions of the drivers at such scenarios, a NN network is trained so that when at a similar scenario, giving the original input from the sensors, the model will be able to directly provide actions as output. This mean that the logic in the decision-making from perception to action is embedded in the NNs. Ref. [64] uses end-to-end methods as the core of its proposed algorithm, however, they used 360-degree camera instead of only using front camera as classical end-to-end planners would do. The trained network has also been carefully modified to adapt to the new censoring format. Also, a route planner is added to help the driving task significantly. Which as result, this proposed method could decrease the chances of collision at intersections. Instead of only having camera input for classical end-to-end methods, Ref. [65] took LIDAR and HD maps as inputs which enables a 3D interpretation, the paper put the newly proposed method into real-world practice, and the learned cost volume could help the ego vehicle perform more safely at intersections, and the vehicle is able to handle traffic light while interacting with the surrounding traffic participants.

Reinforcement Learning Based Approach
Reinforcement learning (RL) is also a common approach for intersection decision-making and planning for autonomous vehicles, for classical RL problems the reward functions are set up using the expert driving knowledge, and shall be fine-tuned for better performance. However, with the development of the NN networks, these approaches are commonly combined with neural networks to better learn from human expert behaviors which enables better feasibility. Bouton et al. [66] introduced a safe RL algorithm using a method called modelchecker to improve safety, they also used a learning based belief update method which allows the applied method to better coop with occlusions and sensor-obtained errors. Ref. [67] enabled general decision-making of intersection vehicles at occluded multi-lane intersections. The reward function combines risk assessment and the NN-based Q network is used to estimate the possible risk, which enabled the planner to enable less cation actions but at the same time ensures safety. Ref. [68] studied typical behaviors of other vehicles which enable the ego vehicle to better coop with environmental interactive vehicles at intersections. The speed profile of the surrounding vehicles and the dynamic distance from the environment vehicles to the ego vehicle was evaluated and are later on considered in the model to perform different types of driving behaviors. This algorithm proposed enables the ego vehicle to cross the intersection safety since it avoid collision almost in most of the operation period, this performance has outperformed many state-of-theart models at that time. Ref. [69] introduced an intersection planner and decision maker that considered the operational-level control of the vehicle at intersections. The planner takes the hard dynamic and kinematic constrains of the ego vehicle into consideration in the deep RL model, this increases the feasibility of the generallevel planning outcomes in the real-world intersection planning scenarios, and the feasibility of using the algorithm in applications was verified by simulations. Isele et al. [70] combined reinforcement learning and NN to learn active sensing behaviors to achieve safety in a congestion intersection environment. Qiao et al. [71] learned the policies of intersection scenarios with automatically generated curriculum (AGC) to increase the learning process. The performance of the deep reinforcement learning approach is closely related to the training data, and it is a crucial thing to obtain accurate, complete, and sound dataset to let the model learn all the possible scenarios which leads to safer and better planning and decision-making outcome.

Partially Observable Markov Decision-Making Process Approach
In a Markov decision-making process (MDP) model, all the states, transfer functions, reward functions, etc. are considered to be static. However, in the real-world application and modeling, the states of the participants and environment are not always fully observable. Thus, to represent this uncertainty of the real world, a partially observable Markov decision-making process (POMDP) based approach has been widely applied and is becoming an emerging technique for highly interactive intersection planning problems for self-driving vehicles. POMDP is a decision-making framework for the agent in an uncertain scenario. The model is able to merge the uncertainties formulation into its own framework, for decision-making and planning of self-driving vehicles, these uncertainties could be sensors perception noise, surrounding drivers' intention uncertainties, future paths uncertainties and control output uncertainties. The agent would receive an observation from the reward after a decision had been generated at each time step [72]. The major approach for solving the POMDP problems could be categorized into two types, one of them is the exact solution methods and another one is the approximate solution methods. For real-world intersections planning problems which is very complicated, an exact solution is often impossible to find in real time, therefore, it is common to obtains approximate results as the solution.
Bouton et al. [73] navigated the ego vehicle at intersection scenarios where frequent changes usually occurred by modeling the problem into a POMDP problem and solved the problem using Monte Carlo tree search (MCTS) method, the presented results proved that the planner outperform the threshold-based heuristic method in both safety and efficiency. Ref. [74] also formulated the intersection planning and decision-making problem into a POMDP problem, the paper modeled the unknown intention of the surrounding vehicles with a probabilistic model and merged that into the observation model of the POMDP model. By solving the POMDP problem, the ego vehicle was able to make decisions in a continuous space on a fixed path at intersections in an online manner. Extend from Ref. [74], Ref. [75] proposed an intersection planning and decision model that the model is able to coop with multiple surrounding vehicles and was capable to outperform the base-line method in computation time, the proposed POMDP model was also able to operate in a continuous state space. Due to the curse of computational complexity, planning methods using POMDP methods often plans the vehicle's speed on a fixed path. However, this indicates that the intersection is not fully explored, Refs. [76,77] proposed a critical turning point (CTP) concept that enables the ego vehicle to generate several candidate paths to select and follow. which the candidate paths are generated through quintic polynomials. Given the candidate paths, the ego vehicle is able to select a candidate path and then plan speed along the selected path based on the beliefs of the uncertain intentions of the oncoming vehicle. Therefore, the intersection space is better explored under a limited amount of computational power.
Ref. [78] considered occlusion caused by both dynamic and static objects, and represented the movements of the surrounding vehicles in the observation filed (OF) and the phantom vehicles as reachable sets. This enabled the ego vehicle to operate in urban intersections and perform less conservative yet safe actions compared to the traditional method. Brechtel et al. [79] considered several kinds of uncertainties and predicted the results from different actions due to various assumptions in the proposed algorithm. The authors did not applied symbolic representation to represent the state-space for all the possible cases, instead, they used a continuous solver that represents specific situations. Ref. [27] developed decision-making at generalized congested intersections that avoided both static and dynamic obstacles. The objects in the observable area were being predicted and the objects in the occluded area were modified with more attention to avoid dangerous collisions. The proposed algorithm was able to generate human-like behaviors at uncontrolled intersections.

Cooperation Methods in Signalized Intersection
Research on optimization algorithms on traffic conflict at intersections began at the end of the last century. Since the concept of CAV has not yet emerged at that time, the early research on traffic optimization algorithms focused on the study of signal timing, i.e., optimization on the traffic signal phase and timing (SPAT).
In 1998, Schutter used optimal control to minimize the queuing length [80]. At the beginning of this century, the second wave of artificial intelligence also affected the field of transportation management. Bingham used a fuzzy neural network method to optimize SPAT [81]. Wiering [82] and Abdulhai [83] introduced reinforcement learning into the field of SPAT optimization. They innovatively regarded the traffic lights in continuous intersections as multiple agents. Through the communication equipment, i.e., V2X technology as shown in Figure 3, the central coordinator collects the position and velocity of the CAVs. Estimation on the waiting time of the vehicles at the intersection is further accomplished to be used in SPAT optimization. In the optimization process, minimizing the waiting time of all CAVs is taken as the optimization target. Q-learning method is applied to optimize the SPAT of multiple intersection traffic lights. Duan extended the previous method, adopting different optimization goals in different traffic flows, and collaboratively optimizing the timing of multiple signal lights [84].
With the merging development of automated vehicle technology, researchers have gradually taken CAVs into consideration for traffic optimization at intersections rather than merely collecting the CAVs' information.
The most commonly-known research in this field is the predictive cruise control algorithm proposed by Asadi [85]. First, through V2I technology, the CAV obtains the SPAT information of several consecutive traffic lights ahead. The time window of the continuous traffic green lights provides hard constraints to CAV control. Then, combined with the vehicle model, model predictive control (MPC) is applied to control the CAV to drive through several consecutive traffic lights with no idling at the intersections. Through continuous velocity trajectory planning, the overall travel time of the CAVs is significantly reduced. In this algorithm, CAVs are guaranteed to pass multiple intersections without stopping, which is also named as green wave traveling in later research. Xu proposed a hierarchical cooperation optimization method for the traffic light and CAVs [86]. In this study, considering the complex coupling relationship between signal light time allocation and CAV driving strategy optimization, Xu decoupled the problem into the upperlevel traffic efficiency optimization problem and the lower-level CAV control problem as shown in Figure 4. In the proposed hierarchical control method, the upperlevel controller optimizes the overall traffic flow and provides each CAV a target arrival time at the intersection. The lower-level controller optimizes the fuel consumption of CAV based according to the received arrival time, i.e., a terminal time constraint in the optimization.

Cooperation Methods in Unsignalized Intersection
Since CAVs obtain traffic light SPAT in advance, it is unnecessary to keep the traffic light if all vehicles are CAV. In approaching the intersection, each CAV is capable of scheduling the arrival time and optimize its speed trajectory. However, the routes of incoming CAVs have complicated conflict relationships at intersections. Therefore, obtaining a high-efficiency conflict-free arrival plan is a priority in CAV scheduling. Researchers have also pointed out that the passing order of the vehicles is the key factor influencing traffic mobility at intersections [87,88]. As incoming CAVs approach the intersection from different directions, the optimal passing order changes constantly and the solution space increases exponentially, preventing the determination of the optimal passing order using the brute-force method. This problem is typically solved using a reservation-based method [89]. The most straightforward reservation-based solution is the first-in-first-out (FIFO) strategy, in which vehicles that enter the intersection first are scheduled to leave it first [14,86,90]. There exist plenty variants of the basic FIFO strategy. In Ref. [91], the FIFO strategy is extended into platoon based scenarios, i.e., the scheduling unit is a CAV platoon rather than single CAV. In Ref. [92], the scheduling is made according to CAVs' estimating arrival time at the intersection. In Ref. [93], the queuing theory is introduced to dissipate the accumulating CAVs.
However, the FIFO passing order is not likely to be the optimal solution in most cases. Batch-based strategy is an improved version of the FIFO strategy, and it processes the vehicles in batches according to their direction to reduce the traffic delay [18]; however, its performance has not been fully optimized. Another widely used strategy is the optimization-based method. After formulating the scheduling problem as an optimization problem, various methods have been proposed to find the optimal passing order. Ref. [74] POMDP to formulate the problem, and used the adaptive belief tree algorithm to find the optimal passing order. Ref. [94] proposed a similar method to create a tree representing all possible solutions of the pass order, where the leaves of the tree stand for the complete scheduling plan. Guler [95] proposed an iterative algorithm to find the optimal passing order. In Ref. [96], they extended their work and proposed two central problems of intersection management: (1) Find the optimal arrival/departure sequence of CAVs, (2) Once they know the arrival/departure time, CAVs can obtain their optimized speed trajectory through various control methods, e.g., optimal control. In their research, they used the branch and bound method to find the best arrival/departure sequence. Guney [97] used particle swarm optimization (PSO), and Lu [98] used mix integer linear planning (MILP) to solve similar problems. Liu [99] transformed the centralized optimization problem into a distributed optimization problem, and solved the problem locally on each vehicle to find the best solution. In Ref. [100], a queue-based method was introduced to find the optimal solution that leads to the smallest average delay. Similar ideas are also applied in Ref. [101].
Apart from these specific methods, some researchers have also noticed that finding the optimal passing order of multiple CAVs is a discrete problem rather than a continuous problem, and thus graph theory is considered a promising methodology to describe and solve this problem. Studies have modeled the problem using graph theory by employing the conflict graph analysis to optimize the traffic-signal phase plan [102,103]. However, the scheduling of multiple CAVs is more difficult than scheduling a limited number of traffic signal phases because the number of incoming CAVs is much larger than the number of traffic signal phases. Some methods have been proposed to address this difficulty. Apart from the depth first spanning tree (DFST) algorithm proposed in Ref. [6], Petri net [104] and conflict duration graph [105] are also used in modeling the scheduling problem. However, the discussion on the optimality of the scheduling problem of multiple CAVs is inadequate.
To tackle the problem of optimality, mathematical modeling, i.e., the computational reduction of the intersection scheduling problem, is an essential research aspect. Some researchers have focused on reducing this problem into typical algorithmic problems, including the job-shop scheduling problem [106], abstraction-based verification problem [107], and polling system problem [108]. In Refs. [109,110], the problem is reduced to minimum clique cover problem, i.e., a classical NP-hard graph theory problem and a heuristic method is proposed to find the optimal passing order with high efficiency.

CAV Control Methods in the Cooperation
CAV control involves two aspects: longitudinal control and lateral control. In most of the previous research, lateral control is neglected by assuming the CAVs are running on their target lanes. Longitudinal control is the fundamental control problem in intersection management.
Considering a fully autonomous intersection scenario, a hierarchical framework is frequently used to realize the CAV coordination [86]. First, after collecting the information of the CAVs in real-time, the centralized controller schedules the arrival time of the CAVs to improve the traffic efficiency. Then, a distributed controller is applied to the CAV to implement the determined arrival plan. Multiple methods have been proposed to address the vehicle control problem. Milanés proposed a fuzzy logic method to optimize the speed trajectories of two CAVs [111]. Onieva used the fuzzy logic method to analyze the relative driving conditions of a CAV and a conventional vehicle. Then genetic algorithm is used to optimize the control parameter of the vehicle [112].
Malikopoulos claimed that at multiple intersections, a distributed optimal control method is suitable to minimize fuel consumption and maximize intersection flow [113]. In this paper, the definition of the CAV information set is given, and it is proved that whether the flow maximization problem has a feasible solution is only related to the hard constraints of the CAVs. They further proposed the application of the distributed optimal control method in multiple intersections [114], and provided the proof of the feasible solution conditions of the method [115]. Moreover, the virtual platoon method was proposed in Ref. [6], which projects the CAVs from different lanes into a virtual lane. Thus, the typical vehicle platoon analysis methods [116] can be used in the control problem of the CAVs on different lanes as if they are traveling in the same lane.
Energy consumption is the main focus of intersection cooperation. Eco-approaching focuses on the CAV approaching process, which is the main focus on most of the research. Avoiding queuing [117,118] and green wave driving [119,120] are two common research topics. Human driving error [121] and mixed traffic [122] are also discussed in some research to further improve the fuel economy. Eco-departure is another types of research which focus on the CAV leaving process. Ref. [123] optimizes the departure process of a CAV platoon to minimize the energy consumption.
In the aforementioned studies on intersections, the CAVs were assumed to run in their target lanes, i.e., only their longitudinal control was considered. In practice, however, CAVs approach from random lanes and have different target lanes; therefore, it is necessary to extend this research to scenarios that permit lane changing. Earlier studies on this topic focused on obtaining a smooth CAV speed trajectory [124,125]; however, traffic efficiency was not investigated completely. With regard to CAV scheduling, which we are concerned with, a few studies formulated this problem as an MIP [126] or a linear programming problem [127]. This was done by assuming that lane changing maneuvers are accomplished in a given time interval. Ref. [128] proposed a practical bi-level framework, where the highefficiency arrival plans and collision-free path planning are solved on the upper and lower levels separately. Ref. [129] proposed a practical method to decouple the longitudinal control and lateral control of the CAVs. Several other prospective studies also focused on changing lane directions dynamically rather than allocating CAVs to constant directional lanes [130][131][132]. However, flexible lane directions are unsuitable for a mixed traffic environment, where human-driven vehicles (HDVs) and CAVs coexist.

Planning and Decision-making Method under Mixed Traffic
For real-world implementation of CAVs, essential requirements include equipment of connected devices, e.g., being equipped with vehicle-to-vehicle or vehicleto-infrastructure equipment [133], and capabilities of autonomous control, i.e., being directly operated by industrial personal computers or microcontrollers [134]. According to the Department of Transportation of US, it will take 20-25 years to realize 95% MPR of connected vehicles [135], and based on the estimation of Ministry of Industry and Information Technology of China, the MPR of partial/conditional autonomous vehicles is expected to reach 25% by 2025 [136]. Considering the long-term deployment of CAV technologies, research on mixed traffic environment of CAVs and HDVs is of great importance to improve traffic mobility at intersections in the near future.

Traffic State Estimation under Mixed Traffic
The aforementioned research mainly considered a fullyautonomous scenario-the market penetration rate (MPR) of CAVs is 100%. In practice, however, it might take decades for all the HDVs in current transportation systems to be transformed into CAVs. Instead, a more practical scenario in the near future is a mixed traffic system where CAVs and HDVs coexist [137]. Existing research on mixed traffic intersections mostly focused on the estimation of traffic states and optimization of traffic signals. For example, Priemer et al. utilized dynamic programming to estimate the queuing length in the mixed traffic environment and then optimized the traffic phase time based on the estimated results [138]. Feng proposed hierarchical architecture to allocate the traffic SPAT [139]. Estimation of location and speed (EVLS) method was built to segment the incoming road. Afterwards, the HDVs' driving information is estimated based on the segmentation. Using fuzzy logic, the traffic condition on the environment was also estimated in Ref. [140]. Ref. [9] proposed a method to combine the vehicle arrival time and expectation maximization to estimate the traffic flow. The fundamental idea of the research is to estimate the HDVs' trajectory and traffic flow through CAVs' trajectory. To improve the estimation accuracy, the celebrated results on microscopic car-following models have been recently employed to describe the behaviors of HDVs, including Gipps Model [141], Optimal Velocity Model (OVM) [142] and Intelligent Driver Model (IDM) [143]; see, e.g., Refs. [144,145].

CAV Control under Mixed Traffic
Despite these existing works, the topic of CAV control, i.e., trajectory optimization of CAVs, in mixed traffic intersections has not been fully discussed. To tackle this problem, several works regarded HDVs as disturbances in the control of CAVs [118,144], or focused on the task of collision avoidance based on the prediction of HDVs' behaviors [30,122]. It is worth noting that most of these methods were limited to improving the performance of CAVs themselves in their optimization frameworks, instead of optimizing the global traffic flow consisting of both HDVs and CAVs at the intersection. Two notable exceptions are in Refs. [29,146], which attempted to improve the performance at the signalized intersection from the perspective of the so-called mixed platoon. They enumerated several possible formations consisting of HDVs and CAVs, and investigated their effectiveness through small-scale simulation experiments. Ref. [147] clarify a general and explicit definition of the "1 + n" mixed platoon as shown in Figure 5. A concise conclusion is made that the stability and controllability of the proposed platoon structure is irrelevant to the platoon size.
In fact, considering the interaction between adjacent vehicles in the same lane, it is easy to understand that velocity trajectories of CAVs could have a certain influence on those of surrounding vehicles, especially the vehicles following behind them. Accordingly, the driving strategies of CAVs could have a direct impact on the performance of the entire mixed traffic intersection; such impact might even be negative when inappropriate CAV strategies are adopted [118]. By contrast, when taking the performance of entire mixed traffic into explicit consideration, the optimization of CAVs' trajectories could bring further benefits to traffic mobility. Such idea of improving the global traffic performance via controlling CAVs has been recently proposed as Lagrangian Control of traffic flow [148], which has been discussed in various traffic scenarios, including closed ring road [137], open straight road [149], traffic bottleneck [150], and non-signalized intersection [151].
The vehicle trajectory planning problem in mixed traffic environment is similar to that of the 100% MPR conditions, which is interpreted in Section III-C. The difference of vehicle trajectory planning in mixed traffic environment is that the HDV behavior has influence the CAV trajectory planning result. Typical methods to solve the problem include shooting heuristic method [30], preceding horizon method [152] and event-triggered algorithm [147].

Discussion and Potential Research Direction
Based on the planning and decision-making technologies mentioned above, we summarize the current open issues and discuss several critical future research directions on planning and decision-making at intersections in this section.

Penetration Rate of CAVs and Level of Automation
The penetration rate of CAVs shows significant influence to the performance of the above planning and decisionmaking methods [72,122,153]. Higher penetration rate of CAVs usually means better performance in many methods. In fact, many existed studies assumed 100% penetration rate of CAVs with fully automation to reduce the complexity of the planning and decision-making model. Full information of all vehicles can be used and/or all vehicles can be controlled to better design the intersection planning and decision-making methods, this guarantees conflict-free vehicle cooperation [86,110]. However, we all know that the 100% penetration rate of CAVs and fully automated vehicles (i.e., level 5) need much more time than we expected, despite the grows of the number of CAVs and the development of CAVs' technologies. Therefore, it is important to analysis the performance of the proposed planning and decision-making models at intersections under mixed traffic scenarios. It will make the method more piratical and convincing if it could handle traffic with both human-driven vehicles and CAVs, with and without connectives, and with different levels of automation [154,155].
Moreover, some studies demonstrated that there existed a significant changing point of the performance along with the increasing of the penetration rate of CAVs. Niroumand et al. developed a program to program to control the trajectory of mixed connected-automated vehicles (CAVs) and connected human-driven vehicles (CHVs) through signalized intersection. The results indicate that the proposed program will become the dominant signal indication after the CAV market penetration rate of 80% [156].
Although there is several remarkable research related to the impact of the penetration rate of CAVs, there are still many problems need to be explored. As a good starting point, it is urgent to conduct more complex intersection planning and decision-making scenarios with different penetration rate of CAVs. Meanwhile, previous studies usually adopted simulation-based model to study the impact of the penetration rate. Developing theoretical models or theoretical simulation combined models can better verify the impact of penetration on intersection planning and decision-making performances.

Reliability of Communication
The rapid development of the V2X technology provides an opportunity to improve traffic mobility and safety in intersection management [133]. Through vehicle-tovehicle (V2V) and vehicle-to-infrastructure (V2I) communication, a centralized controller is deployed at the intersection to coordinate the connected and automated vehicles (CAVs) to pass through the intersection. Same as the penetration rate of CAVs, many current studies assumed perfect communication environment to support the planning and decision-making method [157,158]. However, communication issues, such as latency [159], package loss [160], band-width [161,162], is highly affect the model performance at intersections. It is hard for a traffic engineer to understand what makes these issues occur. However, more efforts to develop a method with robustness to these issues need to be conducted. Liu et al. considered vehicle coordination at signal-free intersections with latency [163].
Although these factors to impact the reliability of communication has been studied by the networking community and the control community, we still lack a connection between communication performance and intersection level planning and decision-making performance.

Driver Behavior of CAVs at Intersections
Significant amount of research has been done on the driver behavior under homogeneous traffic conditions at intersections [164], however little or no research has been found on mixed traffic conditions, where vehicles do vary in physical and dynamic characteristics [165]. As for mixed traffic with CAVs and human-driven vehicles, fewer studies related to the area.
Since there are few CAVs in real-world, researchers mainly using simulation approach to analysis the driver behavior of CAVs under mixed traffic at intersections [166][167][168]. R Arvin et al., evaluated the safety performance of CAVs at intersections. A simulation framework was developed, and Adaptive Cruise Control (ACC) and cooperative CACC car following models are used to mimic the behavior of CAVs [169].
The nature of human planning and decision-making at intersections has significant research value. A welldeveloped driver behavior is robust to environmental uncertainties and can well capture the real-world behavior of human drivers. By adapting the driver behavior model, vehicles with more accurate predictions of human actions can be provided when operating in mixed traffic environments.

Multi Intersections Coordination and Optimization
The current planning and decision-making technologies using CAV mainly emphasized on isolated intersection. Few of them discussed coordination and optimization between two or more intersections. The main challenge of Multi intersections planning and decision-making is how to use CAVs and other communication devices to coordinate multiple intersections, and how to effectively solve the resulting large-scale problems. These issues require further specialized investigation. Bui et al. focused on optimizing traffic flow at multiple intersections using Cooperative game-theoretic approach [170]. Zhang et al. coordinated control of traffic signals between intersections using adaptive genetic algorithm based on cloud computing [171].
Analyzing the planning and decision level coordination and optimization for multi intersections still needs more efforts to conducted. Finding global optimal solution with avoid falling into a local optimum and has fast solution efficiency can significant improve the mobility of the transportation system.

Multi-objective Multi-agent Planning and Decision-making
The traffic condition at intersections very complicated. The geometry of the intersection is often inconsistent and participants are various. Therefore, the planning and decision-making objectives for each traffic participants at intersections are usually different. Current algorithms for optimizing traffic operations at an isolated intersection consider mostly one objective function [19,66,170]. For example, vehicle delay is main factor to impact traffic mobility. The approach to optimize vehicle delay can lead to long pedestrian waiting times as well as reducing safety conditions at the intersection. We know that some optimization goals cannot be achieved at the same time, and some are even contradictory. How to balance or adjust the final collaborative optimization according to different purposes is one of the research directions. Ref. [172] presented a multi-criteria optimization approach to the isolated intersection. The trade-off algorithm between vehicle and pedestrian delay is developed. The proposed approach is tested and showed that by worsening vehicle delays, pedestrian delays can be improved.
Ref. [173] developed a multi-objectives model that considered including guaranteeing CAV safety, alleviating traffic congestion, and improving the performance of fuel consumption.
With the development of future research, at least two different intelligent bodies such as roads and vehicles will appear in the entire autonomous transportation system. How to coordinate and solve the planning decisionmaking problems between different agents will affect road safety and the efficiency of the entire transportation system. At present, there are two solutions for intelligent decision-making technology in autonomous driving technology: one is a rule-based [174], and the other is a solution that is constantly iterated through artificial intelligence [175]. Among them, the rule-based solution requires manual construction of a very complex rule structure, which can happen; while artificial intelligence uses neural networks, confrontation networks and other" black box" processing solutions, the driving behavior of autonomous vehicle drivers is closer to human driving behavior.

Real World Testing and Implementations
Real world testing and implementations for planning and decision-making under CAVs at intersections is another critical research direction in the area. There are thousands of papers related to planning and decision-making methods since 2000's and the majority of them are simulation-based [26,176,177]. In the early stages of research in this field, it is acceptable to avoid using real data. Because in the beginning, it was just to prove that some algorithms are feasible for planning and decision-making at intersections. However, along with the development of different planning and decision-making methods, the data itself plays more and more important roles in problem formulating, model building, and algorithm developing. In simulation environment, we could simplify many issues caused by traffic data, such as observation errors, data lost, and data noise. Meanwhile, for traffic state estimation using real-world observed data, observation errors are inevitable when applied to real traffic systems [178]. Sometimes it may cause bias and even failures of the theoretical methods develop by simulation environment. To conduct initial testing/validation of the methods, there are several large-scale CV/probe vehicle data sets at intersections that researchers can use, which including the Interaction dataset [179], highD dataset [180], and the inD dataset [181].
How and to what extent could the errors using realworld data influence the accuracy and robustness of the models are important research topics.

Energy Conservation
In the past decades, the transportation system has resulted in increased air pollution and energy waste issues. Passenger vehicles are a major pollution contributor, producing significant nitrogen oxides, carbon monoxide, and other pollution. The Environmental Protection Agency estimates that vehicles cause nearly 75 percent of the carbon monoxide pollution in the United States. China claims that the country would reach peak carbondioxide emissions by 2030 and achieve carbon neutrality before 2060. When vehicles pass through intersections, the complexity and uncertainty of the environment often lead to incomplete planning and decision-making, which causes congestion. Traffic congestion usually leads to high energy usage and air pollutant emissions. Therefore, the urgent need to reduce the transportation sector fuel consumption levels requires researchers and policymakers to develop various advanced energy conservation strategies for planning and decision-making at road intersections. There are two major types of research topics in this area. One focuses on the CAVs approaching process, and the other focuses on the CAVs leaving process. For CAVs approaching process, avoiding queuing and green wave driving are two common research topics. Ref. [117] developed an Eco-CACC algorithm that computes the fuel-optimum vehicle trajectory through a signalized intersection by ensuring that the vehicle arrives at the intersection stop bar just as the last queued vehicle is discharged. The result demonstrated that the proposed Eco-CACC system produces vehicle fuel savings up to 40%. Meanwhile, human driving error and mixed traffic are also discussed in some research to improve energy conservation further. As for CAVs leaving process, ecodeparture technologies draw researchers' attention to the CAV leaving process. Ref. [123] analyzed the ecodeparture of CAVs With V2X communication at signalized intersections. The results showed that eco-driving has the potential to save fuel for CAVs. However, its realization requires accurate control of the engine and transmission. Although there are several publications related to energy conservation at a road intersection, there is still lots of research and work that needs to be done in this area. The assumption of the existing papers is usually too idealistic. Maybe the research work's good starting point is to analyze the planning and decision-making under an unsignalized intersection environment.

Conclusions
This paper summarized the studies related to planning and decision-making technologies for CAVs at road intersections. The focus of the research is how to make AVs and CAVs pass the intersection safely and efficiently. We searched several digital libraries for papers published from the last five years and summarized the main themes. The review started with general planning and decision-making technologies, four major approaches are presented: graph-based approach, prediction-based approach, optimization-based approach and machine learning based approach. Then, the review summarized Vehicle-Infrastructure cooperation control method. With the wire-less communication technology development, some planning and decisionmaking methods deployed by the centralized controller at intersections to coordinate the CAVs to pass through the intersection through vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. Finally, the review summarized planning and Decision-making method under mixed traffic. After carefully review the previous studies, the review identified the presented approaches' challenges and pointed out the open issues for futures research. Detailed discussions were also provided to present several future research needs and directions in this important area.