Falsification of Driving Systems ======================================= We provide a falsification pipeline that allows users to test their driving systems against the Rulebook specifications using Scenic programs. To run the falsification pipeline, users need to conduct the following steps: 1. **Interface the system under test with Scenic**. 2. **Define the Rulebook specifications that the system should satisfy**. 3. **Execute the falsification process to identify configurations where the system fails to meet the specifications**. Below we provide the detailed instructions for each step. (#i-interfacing-with-scenic)= I. Interfacing with Scenic --------------------------- To interface your driving system with Scenic, follow these steps: ### A. Define a Policy Agent Create a **Python** class that inherits the ``Agent`` class defined in [``src/evaluation/agent.py``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/agent.py). The class must implement a ``run_step`` method, which generates the control actions (e.g., throttle, brake, steer) for the next timestep. Below is a template for the policy agent: ```{code-block} python class ExampleAgent(Agent): def run_step(self, *args, **kwargs): ... throttle, brake, steer = ... # Generated by the driving policy under test return throttle, brake, steer ``` For a complete implementation example, please refer to ``MetaDrivePolicyAgent`` in [``src/evaluation/metadrive_expert_agent.py``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/metadrive_expert_agent.py), which leverages a [proximal policy optimization (PPO)-trained policy](https://metadrive-simulator.readthedocs.io/en/latest/action.html#ppoexpertpolicy) provided by the MetaDrive simulator. ### B. Define a Scenic Car Create a **Scenic** class that inherits from [Scenic's ``Car`` class](https://docs.scenic-lang.org/en/latest/modules/scenic.domains.driving.model.html#module-scenic.domains.driving.model). In the ``startDynamicSimulation`` method, assign an instance of your policy agent defined in step A (e.g., ``ExampleAgent``) to ``self.controller``. Below is a template for the car class: ```{code-block} scenic class ExampleCar(Car): controller: None def startDynamicSimulation(self): self.controller = ExampleAgent() ``` For a complete implementation example, please refer to ``MetaDrivePPOPolicyCar`` in [``src/evaluation/metadrive_expert.scenic``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/metadrive_expert.scenic). ### C. Define an Action Create a **Python** class that inherits the ``AgentAction`` class in [``src/evaluation/agent.py``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/agent.py). An action represents a single step of control that can be applied to the car. This class must implement an ``applyTo`` method, where the first argument agent is an instance of the car class defined in step B (e.g., ``ExampleCar``). Inside ``applyTo``, call ``agent.controller.run_step`` to retrieve control actions for the next timestep, then use Scenic's built-in methods (``setThrottle``, ``setBraking``, ``setSteering``, etc.) to apply them to the car. Below is a template for the action class: ```{code-block} python class ExampleAction(AgentAction): def applyTo(self, agent, *args, **kwargs): ... throttle, brake, steer = agent.controller.run_step(...) agent.setThrottle(throttle) agent.setBraking(brake) agent.setSteering(steer) ``` For a complete implementation example, please refer to ``MetaDrivePolicyAction`` in [``src/evaluation/metadrive_expert_agent.py``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/metadrive_expert_agent.py). ### D. Define a Scenic Behavior Create a **Scenic** behavior that repeatedly takes the action defined in step C (e.g., ``ExampleAction``). A behavior represents a sequence of actions that the car will execute during the simulation. In the behavior, use the ``take`` specifier to call the ``applyTo`` method defined in the action. Below is a template for the behavior: ```{code-block} scenic behavior ExampleBehavior(): action = ExampleAction() take action while True: take action ``` For a complete implementation example, please refer to ``MetaDrivePPOPolicyBehavior`` in [``src/evaluation/metadrive_expert.scenic``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/metadrive_expert.scenic). ### E. Pass State Information Most policies require access to the ego's current state or environment state to determine the next action. To facilitate this, follow these steps to supplement the classes defined above: #### E-1. Create Global Variables and Monitor Create global variables (e.g., ``SPEED``) to store the state information needed by the policy. Then, define a monitor that continuously updates this variable with the current state at each timestep. ```{code-block} scenic SPEED = [] monitor UpdateState(): while True: SPEED.append(ego.speed) wait ``` ```{note} The global variables should be lists, and the most recent state can be accessed by querying the last element (e.g., `SPEED[-1]`). This is due to Scenic's internal handling of variables. ``` For a complete implementation example, please refer to the ``METADRIVE_ACTOR`` variable and ``MetaDrivePPOUpdateState`` monitor in [``src/evaluation/metadrive_expert.scenic``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/metadrive_expert.scenic). #### E-2. Store State in the Car Class Add a member variable in the car class (e.g., ``ExampleCar``) to store the current state and update it at each timestep. ```{code-block} scenic :emphasize-lines: 2, 5, 6 class ExampleCar(Car): speed: None controller: None def update_actor(self): self.speed = SPEED[-1] def startDynamicSimulation(self): self.controller = ExampleAgent() ``` For a complete implementation example, please refer to ``MetaDrivePPOPolicyCar`` in [``src/evaluation/metadrive_expert.scenic``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/metadrive_expert.scenic). #### E-3. Pass State to the Policy Agent Modify the ``applyTo`` method of the action class (e.g., ``ExampleAction``) to pass the current state to ``run_step`` in the policy agent (e.g., ``ExampleAgent``). Optionally, you can pass the entire Scenic agent (e.g., ``ExampleCar``) if needed. ```{code-block} python :emphasize-lines: 4, 5 class ExampleAction(AgentAction): def applyTo(self, agent, *args, **kwargs): ... agent.update_actor() throttle, brake, steer = agent.controller.run_step(agent.speed, agent, ...) agent.setThrottle(throttle) agent.setBraking(brake) agent.setSteering(steer) ``` For a complete implementation example, please refer to ``MetaDrivePolicyAction`` in [``src/evaluation/metadrive_expert_agent.py``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/metadrive_expert_agent.py). ### F. Link the Behavior to the Car in the Scenic Program In the main **Scenic** program where the scenario is defined, create an instance of the car class defined in step B (e.g., ``ExampleCar``) and assign the behavior defined in step D (e.g., ``ExampleBehavior``) to it. If your policy requires state information, ensure that the monitors defined in step E-1 are active throughout the simulation. Below is a template: ```{code-block} scenic ego = new ExampleCar at ..., with ..., with behavior ExampleBehavior() require monitor UpdateState() ``` Using the MetaDrive PPO policy as an example: ```{code-block} scenic ego = new MetaDrivePPOPolicyCar at egoSpawnPt, with blueprint MODEL, with behavior MetaDrivePPOPolicyBehavior(egoTrajectory) require monitor MetaDrivePPOUpdateState() ``` (#ii-defining-rulebook-specifications)= II. Defining Rulebook Specifications ---------------------------------------- To define the Rulebook specifications that your system should satisfy, follow the instructions in the [Rules](rules.md) and [Rulebooks](rulebook.md) sections of the documentation. For complete examples, please refer to [`with_vru.graph`](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/assets/with_vru.graph) and [`no_vru.graph`](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/assets/no_vru.graph) in [``src/evaluation/assets/``](https://github.com/BerkeleyLearnVerify/ScenicRules/tree/main/src/evaluation/assets), which specify the Rulebook specifications for scenarios with and without VRUs, respectively. You can define your own specifications by modifying these graph files or creating new ones based on the same format. III. Running Falsification ----------------------------------------- Falsification is a simulation-based formal analysis method used to systematically identify failure cases in a system under test. Before running the falsification flow, you need to prepare the following input files: - **Scenic Program** (`*.scenic`): Describes the scenario and links to the driving system under test (see {ref}`#i-interfacing-with-scenic`). - **Rulebook Specification** (`*.graph`): Defines the hierarchical rulebook specifications (see {ref}`#ii-defining-rulebook-specifications`). The falsification process follows an iterative loop: 1. **Parameter Generation**: The falsifier generates a vector of input parameters. 2. **Simulation**: The simulator executes a simulation run based on these input parameters. 3. **Evaluation**: The resulting simulation trace is evaluated against the provided Rulebook specification. 4. **Feedback**: For active falsifier, the evaluation result serves as feedback to guide the falsifier in selecting the next set of parameters. 5. **Iteration**: Repeat Steps 1 to 4. ``` {note} In our framework, the input parameter space is defined directly within the main Scenic program using the `param` specifier. ``` Below we first provide a quick start on how to run the falsification flow, and then we detail the implementation of the flow. ### Quick Start Users can run the falsification flow on a Scenic scenario using the following command in [`src/evaluation`](https://github.com/BerkeleyLearnVerify/ScenicRules/tree/main/src/evaluation): ```{code-block} shell python run_evaluation.py \ --config-name=eval.yaml \ hydra.job.chdir=False \ hydra.output_subdir=null \ scenic.file_path=SCENARIO_PATH ``` where `eval.yaml` defines the configurations of the falsification flow, and `SCENARIO_PATH` is the path to the Scenic program. The `eval.yaml` file should locate at [`src/evaluation/cfgs`](https://github.com/BerkeleyLearnVerify/ScenicRules/tree/main/src/evaluation/cfgs). Users can modify the provided [`eval.yaml`](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/cfgs/eval.yaml) based on their needs. Here is an example alongside the description of the configuration file: ```{code-block} yaml hydra: run: dir: ./outputs/${now:%Y-%m-%d}/ # The output folder of the log file job_logging: handlers: file: class: logging.FileHandler formatter: simple filename: ${hydra:run.dir}/run_evaluation.log # The output log file name agent: type: 'built_in' # The driving system under test. In our scenarios, we provide three driving policies: 'built_in', 'metadrive_ppo', and 'ppo_with_built_in'. If you define your own driving policy, remember to change the type here. scenic: scenario_name: 'crash_apple_082321' # Name of the scenario file_path: '../../scenarios/crash/crash_apple_082321.scenic' # The path to the Scenic scenario program map_file_path: '../../maps/Town04.net.xml' # The path to the map file simulator: 'metadrive' # The simulator. Our framework is mainly developed and tested on 'metadrive', but other simulators that are interfaced with Scenic can also be adopted, including 'newtonian' and 'carla'. max_steps_per_simulation: 180 # The maximum simulation steps of a simulation run. max_rejection_sampling_iterations: 2000 # The maximum rejection sampling iterations for Scenic. timeout: 40 # Timeout (in seconds) for each simulation run. The simulation terminates after this duration has elapsed. max_retries: 5 # Max retries for each simulation. The entire falsification flow terminates after this number of retries is reached. rulebook: file_path: 'assets/with_vru.graph' # The path to the Rulebook specification file. add_reaching_goal_rule: True # Whether to add a reaching-goal rule to the Rulebook. As the reaching-goal rule is a highly scenario-dependent rule, we treat is as a standalone option in our implementation. falsification: sampler_type: 'mab' # The falsification sampler. Options include 'random', 'halton', 'ce' (cross-entropy), and 'mab' (multi-armed bandit). active: True # Whether the sampler is active. True for 'ce' and 'mab', False for 'random' and 'halton'. params: # Paratemers for active samplers. Only effective when active is True. alpha: 0.9 thres: 0.0 buckets: 5 exploration_ratio: 2 # This option is only for 'mab'. experiment: rng_seed: 1 # The random seed for the entire falsification flow. num_seeds: 1 # The number of random seeds to run. num_samples: 30 # The number of samples to generate for each random seed. # Note that the total number of simulation runs will be num_seeds * num_samples. visualization: record_simulation: False # Whether to record the simulation. If True, the simulation will be recorded and saved as a video file in the record_dir. ids: ['egoPoly', 'egoLanePoly', 'advPoly', 'advLanePoly'] # The IDs of the objects to be visualized. Each object should be recorded with the "record" statement in the Scenic program using the same ID. For example, "record ego._boundingPolygon as egoPoly". record_dir: ./outputs/${now:%Y-%m-%d}/ # The output folder for the recorded videos. ``` ### Detailed Implementation of the Flow The falsification flow is implemented in [``src/evaluation/run_evaluation.py``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/evaluation/run_evaluation.py). The main falsification process is implemented in the `run_evaluation` function. The process includes the following steps: 1. **Initialization**: The function initializes the simulator, the input parameter space and the falsification sampler, and the Rulebook specification. 2. **Falsification Loop**: The function enters a loop that iterates for a specified number of samples. In each iteration, it generates a vector of input parameters using the falsification sampler, runs a simulation with these parameters, evaluates the resulting trajectory against the Rulebook specification, and collects the evaluation results. If the sampler is active, it also updates the sampler with the evaluation results to guide the next parameter generation. 3. **Logging**: The function logs the evaluation results and other relevant information. #### Falsifier We provide several sampling algorithms for the falsifier, including random sampling (`random`), Halton sequence sampling (`halton`), cross-entropy method (`ce`), and multi-armed bandit (`mab`). The samplers are imported from the [VerifAI](https://verifai.readthedocs.io/en/latest/) toolkit or implemented in [``src/rulebook_benchmark/samplers.py``](https://github.com/BerkeleyLearnVerify/ScenicRules/blob/main/src/rulebook_benchmark/samplers.py). #### Agent We provide three driving policies for evaluation: `built_in`, `metadrive_ppo`, and `ppo_with_built_in`. The `built_in` policy uses the Scenic built-in rule-based policy with PID control. The `metadrive_ppo` policy uses a PPO-trained policy provided by MetaDrive. The `ppo_with_built_in` policy uses the rule-based policy for high-level decision making and the PPO policy for low-level control. To use the `metadrive_ppo` or `ppo_with_built_in` policy, users need to download the PPO policy weights from the [MetaDrive repository](https://github.com/metadriverse/metadrive/blob/main/metadrive/examples/ppo_expert/expert_weights.npz) and place the weights file in the `src/evaluation/assets` folder. To use your own driving policy, you can create a new agent class following the instructions in {ref}`#i-interfacing-with-scenic` and update the `agent.type` field in the configuration file accordingly.