Introduction to Large Scale Modeling Systems

Cinzia Cirillo, Ph.D., Associate Professor
Department of Civil and Environmental Engineering, University of Maryland
Sept 3rd, 2013
Heart 2013 – Summer School, Stockholm

About myself
MS Civil Engineering, Università “Federico II”, Naples (ITALY)
PhD Transportation Engineering, Politecnico di Torino, Torino (ITALY)
Stagiaire and Consultant, Hague Consulting Group, The Hague (NL) and Cambridge (UK)
Post Doc, Marie Curie fellowship (EU), Applied MATH, University of Namur (BELGIUM)
Assistant and now Associate Professor (with tenure), Department of Civil and Environmental Engineering, University of Maryland (USA)
This year on sabbatical leave at TU-Delft (NL)

My students at UMD…(Pratt, Michael, JM, Nayel, Renting, Me, Yangwen)

Table of Contents Terminology Four-step models Tour-based models Activity-based models Integrated land use and transportation models

Terminology of Network Representation Zones (Centroid, Centroid Connectors) Nodes Links

Four-Step Trip-Based Travel Demand Model

What are the Four Steps?
Trip Generation (Ti) – Number of trips produced in and attracted to zone “i” [Number of trips that will be generated]
Trip Distribution (Tij) – Number of trips produced in zone “i” and attracted to zone “j” [Where the trips might go]
Mode Split (Tijm) – Number of trips produced in zone “i” and attracted to zone “j” traveling by mode “m” [Which mode of transportation travelers choose: automobile, rail, bus, bicycle, etc.]
Traffic Assignment (Tijmr) – Number of trips produced in zone “i” and attracted to zone “j” traveling by mode “m” over route “r” [Predicts the path the trips will take]

Trip Generation Model: Terminology Trip Trip Ends Tours Home-Based Trip Non-Home-Based Trip Trip Production Trip Attraction Trip Generation Trip Purpose

Methods for Trip Generation Modeling
Growth Factor Analysis: often used for external trip generation modeling
Cross Classification Methods: most widely used in current practice
Regression Models: zone-level regression, household-level regression
Combining Cross-Classification and Regression Methods
Trip Rate Analysis: often used for special trip generators (ITE Trip Generation Manual)
Matching Trip Productions and Attractions

Cross Classification Models

Zone-Level Regression Analysis: Example
Production: Pi = 22.4 + 1.87 HHi + 0.22 Ai
Attraction: Aj = 57.2 + 0.87 Ej + 0.15 Rj
Pi: Total number of HBW trips produced from zone i
Aj: Total number of HBS trips attracted to zone j
HHi: Total number of households in zone i
Ai: Total number of automobiles in zone i
Ej: Total employment in zone j
Rj: Total retail space in zone j
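The two regression equations can be evaluated directly. The sketch below uses the coefficients from the slide; the zone inputs (and the unit of retail space) are hypothetical values chosen only for illustration.

```python
def hbw_productions(households, autos):
    """P_i = 22.4 + 1.87*HH_i + 0.22*A_i (coefficients from the slide)."""
    return 22.4 + 1.87 * households + 0.22 * autos


def hbs_attractions(employment, retail_space):
    """A_j = 57.2 + 0.87*E_j + 0.15*R_j (coefficients from the slide)."""
    return 57.2 + 0.87 * employment + 0.15 * retail_space


# Hypothetical zone data, not taken from the presentation.
p_i = hbw_productions(households=1000, autos=1500)
a_j = hbs_attractions(employment=2000, retail_space=400)
print(f"Productions: {p_i:.1f}, attractions: {a_j:.1f}")
```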

Matching Productions and Attractions
Pi: number of trips produced from zone i
Aj: number of trips attracted to zone j
P’i: adjusted number of trips produced from zone i
A’j: adjusted number of trips attracted to zone j
i, j: index of zones
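The balancing formula itself did not survive on this slide. A minimal sketch of one common convention, assumed here rather than taken from the presentation, holds productions fixed (P’i = Pi) and scales attractions so that both totals agree. The function name and zone values are hypothetical.

```python
def balance_attractions(productions, attractions):
    """Scale attractions so their total equals the total of productions.

    Assumed convention: productions are trusted and left unchanged,
    A'_j = A_j * sum(P) / sum(A).
    """
    total_p = sum(productions)
    total_a = sum(attractions)
    if total_a == 0:
        raise ValueError("total attractions must be positive")
    factor = total_p / total_a
    return [a * factor for a in attractions]


# Hypothetical three-zone example.
productions = [100.0, 200.0, 300.0]
attractions = [150.0, 250.0, 100.0]
adjusted = balance_attractions(productions, attractions)
print(adjusted)
```

Other conventions exist (for example, scaling productions to attractions for some trip purposes); which side is held fixed is a modeling decision.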

Methods for Trip Distribution Modeling
Growth Factor Analysis: use it only if no better method is feasible
Synthetic Model (e.g. Gravity Model): most widely used in practice
Discrete Choice Model: more flexible model structure and behaviorally rich; the gravity model can be shown to be a special case of a discrete choice model
Statistical/Optimization Methods for Estimating OD Trip Tables from Traffic Counts
Intervening Opportunity Model
Etc.

Basic Gravity Model
Idea comes from Newton’s Law of Gravitation
Where:
Tij: Number of trips from zone i to zone j
Pi: Productions at i
Aj: Attractions at j
Fij: A function of travel time, distance, and/or cost, e.g. Fij = 1/(Cij)^2 or Fij = exp(b*Cij)
Kij: Socioeconomic factor (specified by the modeler)
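The gravity formula is missing from the extracted slide. The sketch below assumes the standard production-constrained form, Tij = Pi * Aj * Fij * Kij / sum over j' of (Aj' * Fij' * Kij'), which is consistent with the symbols defined above but is not confirmed by the source. The zone data are hypothetical.

```python
def gravity_model(productions, attractions, cost, friction, k=None):
    """Production-constrained gravity model (assumed form).

    Each row i of the result sums to productions[i].
    `friction` maps a travel cost C_ij to the factor F_ij;
    `k` is an optional matrix of socioeconomic factors K_ij (default 1).
    """
    trips = []
    for i, p_i in enumerate(productions):
        weights = []
        for j, a_j in enumerate(attractions):
            k_ij = 1.0 if k is None else k[i][j]
            weights.append(a_j * friction(cost[i][j]) * k_ij)
        denom = sum(weights)
        trips.append([p_i * w / denom for w in weights])
    return trips


def inverse_square(c):
    """F_ij = 1 / C_ij^2, one of the friction forms named on the slide."""
    return 1.0 / c ** 2


# Hypothetical two-zone example.
trips = gravity_model(
    productions=[100.0, 200.0],
    attractions=[150.0, 150.0],
    cost=[[1.0, 2.0], [2.0, 1.0]],
    friction=inverse_square,
)
for row in trips:
    print([round(t, 1) for t in row])
```

This form guarantees that row totals match productions; matching column totals to attractions as well would need an iterative (doubly constrained) adjustment, which is not shown.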

Model Choice Decision maker Alternatives Attributes of alternatives Decision rule Conjunctive rules, e.g. Satisfaction Disjunctive rules, e.g. A set of if-then rules Lexicographical rules, e.g. Dominance Compensatory rules, e.g. Utility maximization Combination of rules, e.g. Elimination by aspects Other Heuristic Decision Rules Etc

Utility Maximization Theory
U1n = U(t1n, c1n) = b1 t1n + b2 c1n
U2n = U(t2n, c2n) = b1 t2n + b2 c2n
Individual n chooses alternative 1 if U1n > U2n.
When there are multiple alternatives, individual n chooses alternative 1 if U1n > Uin for every other alternative i.

Random Utility Maximization Theory
U1n = U(t1n, c1n) = b1 t1n + b2 c1n + e1n
U2n = U(t2n, c2n) = b1 t2n + b2 c2n + e2n
Let V1n = b1 t1n + b2 c1n and V2n = b1 t2n + b2 c2n, so that
U1n = V1n + e1n
U2n = V2n + e2n
V: Systematic utility
e: Random utility

Discrete Choice Models
Binary: Prob(1) = P(U1n > U2n) = P(V1n + e1n > V2n + e2n) = P(e2n – e1n < V1n – V2n)
Multinomial: Prob(i) = P(Ui > max Uj, j≠i)
If e is assumed to be normally distributed, a Probit choice model is obtained.
If the difference e2n – e1n is assumed to be logistically distributed (for example, when e is i.i.d. extreme value), a Logit choice model is obtained.
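For the logit case the choice probabilities have a closed form, Prob(i) = exp(Vi) / sum over j of exp(Vj). A minimal sketch follows; the coefficient values b1 and b2 and the travel times and costs are hypothetical, chosen only to make the example run.

```python
import math


def systematic_utility(time, cost, b1=-0.05, b2=-0.4):
    """V = b1*t + b2*c, with hypothetical coefficients."""
    return b1 * time + b2 * cost


def logit_probabilities(utilities):
    """Multinomial logit probabilities from systematic utilities V_i."""
    v_max = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(v - v_max) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical binary choice: alternative 1 (car) vs. alternative 2 (bus).
v1 = systematic_utility(time=20.0, cost=4.0)
v2 = systematic_utility(time=35.0, cost=2.0)
p1, p2 = logit_probabilities([v1, v2])
print(f"P(1) = {p1:.3f}, P(2) = {p2:.3f}")
```

With two alternatives this reduces to the binary logit, P(1) = 1 / (1 + exp(V2 – V1)). The probit model has no such closed form and is evaluated by numerical integration or simulation.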

Traffic Assignment / Equilibrium
Supply: Travel cost = f (Travel demand)
Demand: Travel demand = f (Travel cost)
An equilibrium is achieved when both supply and demand equations are simultaneously satisfied.
Wardrop’s Two Traffic Equilibrium Principles
First Principle: User Equilibrium (UE). Each user acts to minimize his/her own travel cost. At UE, all used routes between each OD pair have equal travel costs, while all unused routes have equal or higher travel costs.
Second Principle: System Optimal (SO). Each user acts to minimize the total travel cost in the system. At SO, the lowest total system travel cost is achieved.
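The difference between the two principles can be seen on a toy network. The sketch below assumes a single OD pair served by two parallel routes with linear cost functions t_k(x) = a_k + b_k * x (b_k > 0); the network, the cost form, and all numbers are hypothetical and not from the presentation. UE equalizes route costs, while SO equalizes marginal costs.

```python
def split_two_routes(q, a1, b1, a2, b2, marginal=False):
    """Flow on route 1 for demand q and costs t_k(x) = a_k + b_k * x.

    marginal=False equalizes route costs (user equilibrium);
    marginal=True equalizes marginal costs (system optimum).
    The result is clipped to [0, q] for the case where one route is unused.
    """
    m = 2.0 if marginal else 1.0
    x1 = (a2 - a1 + m * b2 * q) / (m * (b1 + b2))
    return min(max(x1, 0.0), q)


def total_cost(q, x1, a1, b1, a2, b2):
    """Total system travel cost for a given flow x1 on route 1."""
    x2 = q - x1
    return x1 * (a1 + b1 * x1) + x2 * (a2 + b2 * x2)


# Hypothetical network parameters.
q, a1, b1, a2, b2 = 20.0, 10.0, 1.0, 20.0, 0.5
x_ue = split_two_routes(q, a1, b1, a2, b2)
x_so = split_two_routes(q, a1, b1, a2, b2, marginal=True)
print(f"UE: x1 = {x_ue:.2f}, total cost = {total_cost(q, x_ue, a1, b1, a2, b2):.1f}")
print(f"SO: x1 = {x_so:.2f}, total cost = {total_cost(q, x_so, a1, b1, a2, b2):.1f}")
```

In this example the SO flow pattern has a lower total cost than the UE pattern, but under SO the two routes no longer have equal costs, so some users could reduce their own cost by switching.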

Classification of Traffic Assignment

Tour-Based and Activity-Based Models

Activity-Based Models Recognize… Travel is a derived demand Spatial, temporal, transportation and interpersonal interdependencies constrain activity/travel behavior Household and other social factors/structures influence travel and activity behavior Activity-based approaches aim at predicting which activities are conducted where, when, for how long, with whom, the transport mode involved and ideally also the implied route decisions.

Typical Specification of ABM: Type 1 Population synthesis and updating Mobility-lifestyle choices (auto, home location etc.) Day-level activity pattern generation (List of Activities with or without sequencing) Scheduling of activities Activity or tour-based mode and destination choices

Typical Specification of ABM: Type 2 Population synthesis and updating Mobility-lifestyle choices (auto, home location etc.) Day-level activity pattern generation (Primary and secondary tours and their sequencing) Tour-level primary activity destination, mode, and scheduling choices Stop-level secondary activity destination, mode, and scheduling choices

ABM History
HATS (Jones 1979)
CARLA (Jones et al. 1983)
STARCHILD (Recker et al. 1986a, 1986b)
SCHEDULER (Garling et al. 1989)
SMASH (Ettema et al. 1993)
SAMS and AMOS (Kitamura et al. 1993, RDC Inc. 1995, Kitamura et al. 1996)
MIDAS (Kitamura and Goulias 1989, Goulias and Kitamura 1996)
SMART (Stopher et al. 1996)
GISICAS (Kwan 1997)
PCATS (Kitamura and Fujii 1998)
ALBATROSS (Arentze and Timmermans 2000)
PETRA (Fosgerau 2001)
SIMAP (Kulkarni and McNally 2001)
TASHA (Miller and Roorda 2003)
CEMDAP (Bhat et al. 2004)
FAMOS (Pendyala et al. 2004)
TRANSIMS (Los Alamos National Laboratory 2005)

ABM in Practice
U.S.: Atlanta, GA; Boston, MA; Columbus, OH; Dallas, TX; Denver, CO; New York, NY; Portland, OR; Sacramento, CA; San Francisco, CA; Southeast Florida; Statewide in Oregon
International: Netherlands, Switzerland, Germany, Chile, etc.

ABM Benefits
Predicts travel behavior along a continuous time axis, including scheduling adjustments;
Assesses the impact of sophisticated travel demand management measures;
Can be easily modified to evaluate policy scenarios with or without new SP surveys (e.g. extended transit service, dynamic pricing, daycare facilities at work, flexible work hours);
Produces results with the desired level of spatial and temporal accuracy using a synthetic population sample;
More comprehensively evaluates the impact of transportation projects and policies on the entire activity-travel pattern, not just on a single trip.

ABM Data Needs Demand Side Longitudinal and geographic information on household or individual time use (e.g. type of activities, travel, activity locations, activity duration, scheduling); Socio-demographic information (e.g. household composition, age, gender, job, income, housing); Auto-ownership and other household mobility and lifestyle choices; Activity-travel pattern changes/shifts over time and in response to transportation system changes; Household characteristics with regard to telecommunication.

ABM Data Needs: Supply Side
Transportation networks coded to the activity-stop level;
Level of service of the transportation network by time of day (this could be endogenous with DTA);
Daily, day-of-the-week, and seasonal activity time windows (e.g. store open hours, periods during which specific activities can be pursued);
Spatial and non-spatial inventory of activity locations, land use, and economic data.

ALBATROSS (Arentze and Timmermans 2000, 2004)
Albatross: A learning based transportation oriented simulation system
The model predicts which activities are conducted when, where, for how long, with whom, and by which transport mode
A decision tree is proposed as a formalism to model the heuristic choice
Considers various constraints on behavior:
Situational constraints: can’t be in two places at the same time
Institutional constraints: such as opening hours
Household constraints: such as bringing children to school
Spatial constraints: e.g. particular activities cannot be performed at particular locations
Time constraints: activities require some minimum duration
Spatial-temporal constraints: an individual cannot be at a particular location at the right time to conduct a particular activity

ALBATROSS (Arentze and Timmermans 2000, 2004) Albatross assumes that choice behavior is based on rules that are formed and continuously adapted through learning while the individual is interacting with the environment (reinforcement learning) or communicating with others (social learning). Options for rule-based behavior representation: Decision trees (used in Albatross) Classification rules Bayesian network Etc.

Albatross Model Flowchart Each oval represents a decision tree

CEMDAP (Bhat et al. 2003)
CEMDAP: Comprehensive Econometric Micro-simulator for Daily Activity-travel Patterns
A system of econometric models that represent the activity-travel decision-making behavior of individuals.
Input: Various land-use, socio-demographic, activity system, and transportation level-of-service attributes
Output: Complete daily activity-travel patterns for each individual in the household.

Daily Activity-Travel Pattern: Worker

Daily Activity-Travel Pattern: Non-Worker

CEMDAP Modeling Framework

Activity Generation-Allocation Module

Activity Generation-Allocation Models

Pattern/Tour/Stop-Level Scheduling Modules

Pattern/Tour/Stop-Level Scheduling Models

Activity-Based Model Applications Effects of development patterns on travel behavior Sensitivity to price and behavioral changes Effects of transportation system and system condition Need for improved validity and reliability Ability to evaluate policy initiatives Better analysis of freight movement Ability to show environmental effects Modeling low-share alternatives Better ability to evaluate effects on specific subgroups Reflect non-system policy changes (TDM, ITS)

Transportation Eras and Urban Growth Patterns

Integrated land use and transportation models

Population Density vs. Distance to City Center

Population Density vs. %Transit Mode Share

Transportation and Land Use

A More Detailed Theoretical Framework

Land Use Model Components Input: Total population and total employment by type in the study area Output: Population and employment by type in each spatial analysis unit Typical Spatial Analysis Unit: TAZ, Census tract, Parcel, Block, Grid cell Demand Modules: Household location choice, Employment location choice, and/or Household/employment relocation choice Supply Modules: Housing development, business real estate development Balancing Supply and Demand: No balancing, Price and equilibrium, Disequilibrium

Land Use-Transportation Microsimulation

UrbanSim

The Travel/Activity Scheduler for Household Agents (TASHA) model

Thank you! Q&A

Integrated Discrete Continuous Choice Models Theory and Applications Household Vehicle Ownership, Type and Usage

Table of Contents

Motivation
In the U.S., transportation contributes approximately 27 percent of total greenhouse gas emissions. About 71 percent of oil consumption goes to transportation fuels, of which 40 percent fills the gasoline tanks of personal vehicles.
American households are highly dependent on private vehicles: in 2009, the average household owned 2.05 vehicles, and only about 5% of households did not have a car.
The use of private vehicles is strongly related to traffic congestion, energy consumption, and the environment. It is therefore crucial to understand household vehicle behavior: how many vehicles households own, which types, and how many miles they travel.
Households make these decisions simultaneously. Transportation modelers should therefore estimate them in one system rather than separately, both to better understand travel behavior and to give policy makers a better reference. However, only a few studies in the literature have investigated the three choices jointly.

Literature Review
Discrete-continuous models derived from a conditional indirect utility function (e.g., Train, 1986)
In the discrete part, the utilities of the alternatives are represented by conditional indirect utility functions, and the person chooses the alternative with the highest utility.
In the continuous part, the demand functions are derived from the conditional indirect utility functions using Roy’s identity.
Limitations:
The models estimate the choice probabilities and the demand equations sequentially, not simultaneously.
The estimates are consistent but not as efficient as full information maximum likelihood, because the unobserved component of utility and the error in the demand equation generally contain some common unobserved factors.
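For reference, Roy’s identity in its standard textbook form is given below. The notation is an assumption, not taken from the slides: V_i is the conditional indirect utility of alternative i, p_i the price (e.g. operating cost per mile) of using it, y household income, and x_i the resulting demand (e.g. miles traveled).

```latex
x_i(p_i, y) \;=\; -\,\frac{\partial V_i(p_i, y) / \partial p_i}{\partial V_i(p_i, y) / \partial y}
```

Because the demand equation is derived from the same V_i that drives the discrete choice, the two parts share parameters, which is why they are naturally estimated together.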

Literature Review (Con’t) Multiple Discrete Continuous Extreme Value (MDCEV) model Limitations: Does not include vehicle holding decision. Requires fine classification of vehicles as one type of vehicle cannot be chosen twice by the household. The assumption of fixed total mileage budget for every household implies that it is not possible to predict changes in the total number of miles in response to policy changes. There is only a single error term underlying both discrete and continuous choices.

Literature Review (Con’t) Bayesian Multiple Ordered Probit and Tobit (BMOPT) Model Limitations: The computation becomes intensive for a large number of vehicle categories, as the number of equations to be estimated increases proportionally with the number of vehicle types. Ordered mechanism may not perform as well as unordered mechanism in modeling car ownership decisions (Bhat and Pulugurta, 1998; Potoglou and Kanaroglou, 2008)

Research Objectives
Develop a mathematical framework to model household choices of vehicle ownership, vehicle types, and annual mileage traveled. In particular, the model should:
simultaneously estimate discrete (vehicle holding and types) and continuous (vehicle usage) decision variables;
take into account a large number of alternatives in both the vehicle holding and the vehicle type choices;
impose no budget on the mileage traveled;
capture the correlations of the unobserved factors between the discrete and continuous parts;
have flexible specifications; and
be sensitive to policy analysis.
In addition, investigate the performance of ordered and unordered structures in discrete-continuous models.

Unordered Discrete-Continuous Model
[Figure: household decision structure. Number of vehicles and the type of each: 0; 1 (Type1); 2 (Type1 & Type2); 3 (Type1 & Type2 & Type3); 4 (Type1 & Type2 & Type3 & Type4). Continuous outcome: annual miles traveled.]

Unordered Discrete-Continuous Model (Con’t) In the unordered structure, the household is assumed to be rational and to choose the alternative of vehicle ownership level that maximizes its utility.

Unordered Discrete-Continuous Model (Con’t) The discrete choices Y– Multinomial Probit Where, The continuous choice Yreg– Regression

Unordered Discrete-Continuous Model (Con’t) The integrated discrete-continuous model:

Unordered Discrete-Continuous Model (Con’t) Estimation with Monte Carlo Simulation: Where is a draw from a multivariate normal with mean and variance Then, the final Simulated Log Likelihood of the model is:
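The simulator used on this slide did not survive extraction. As an illustration of the general idea of estimating probit choice probabilities from multivariate normal draws, the sketch below uses a crude frequency simulator: draw the error vector many times and count how often each alternative has the highest utility. This is a simpler technique than whatever the presentation used, and the utilities and covariance matrix are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L * L^T = cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def simulate_probit_probabilities(v, cov, n_draws=20000, seed=0):
    """Share of draws in which each alternative has the largest utility V + e."""
    rng = random.Random(seed)
    low = cholesky(cov)
    n = len(v)
    counts = [0] * n
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated normal errors: e = L z has covariance `cov`.
        e = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        u = [v[i] + e[i] for i in range(n)]
        counts[u.index(max(u))] += 1
    return [c / n_draws for c in counts]


# Hypothetical systematic utilities and error covariance for three alternatives.
v = [0.5, 0.0, -0.5]
cov = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(simulate_probit_probabilities(v, cov))
```

A frequency simulator is not smooth in the parameters, which makes it awkward inside a maximum likelihood routine; smooth simulators or the numerical integration approach on the next slide are usual in estimation.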

Unordered Discrete-Continuous Model (Con’t) Estimation with Numerical Computation (Genz, 1992):

Ordered Discrete-Continuous Model
The ordered response structure uses latent variables to represent the vehicle ownership propensity of the household. Suppose two latent variables yd and yr represent the preference levels for vehicle holding and vehicle usage:
The number of vehicles held by the household (Y) is determined by the value of the latent variable yd, specifically:
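The threshold rule itself is missing from the extracted slide. A minimal sketch of the usual ordered-response mapping follows, assuming four increasing thresholds so that Y = 0 when yd is at or below the first threshold, Y = 1 between the first and second, and so on up to Y = 4. The threshold values and the boundary convention are assumptions.

```python
from bisect import bisect_left


def vehicles_from_latent(y_d, thresholds):
    """Map the latent propensity y_d to a vehicle count.

    Returns k such that thresholds[k-1] < y_d <= thresholds[k],
    i.e. the number of thresholds strictly below y_d.
    `thresholds` must be sorted in increasing order.
    """
    return bisect_left(thresholds, y_d)


# Hypothetical thresholds separating 0, 1, 2, 3, and 4 vehicles.
thresholds = [0.0, 1.0, 2.0, 3.0]
for y_d in (-0.5, 0.5, 1.7, 3.5):
    print(y_d, "->", vehicles_from_latent(y_d, thresholds), "vehicle(s)")
```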

Ordered Discrete-Continuous Model (Con’t) Similarly, to capture the correlation between the discrete and continuous parts, we allow the error terms to be correlated. Thus, the error terms follow a bivariate normal distribution: The likelihood of one observation is Where

Ordered Discrete-Continuous Model (Con’t) The conditional mean and variance of the ordered probit are: The final likelihood of one observation can be written as: where,

Case Study Data sources: 2009 National Household Travel Survey (NHTS) data – 1420 observations in the Washington D.C. Metropolitan area Vehicle characteristics Choice set: Vehicle holding: 0, 1, 2, 3 and 4 car(s) Vehicle type: 120 alternatives for the type choice of each vehicle (12 classes x 10 vintages) Vehicle usage: annual miles traveled

Subsample of the chosen alternative plus 20 randomly selected ones
12 vehicle classes for each of 10 vintages: a total of 120 alternatives
The classes of vehicles are small domestic car; compact domestic car; mid-size domestic car; large domestic car; luxury domestic car; small import car; mid-size import car; large import car; sporty car; minivan/van; pickup trucks; SUVs.
The 10 vintages are pre-1999 and the years 2000 through 2008.
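The construction of the sampled choice set can be sketched as follows. The class and vintage labels come from the slide; the function name, the uniform sampling without replacement, and the example chosen vehicle are assumptions made for illustration.

```python
import random
from itertools import product

CLASSES = [
    "small domestic car", "compact domestic car", "mid-size domestic car",
    "large domestic car", "luxury domestic car", "small import car",
    "mid-size import car", "large import car", "sporty car",
    "minivan/van", "pickup truck", "SUV",
]
VINTAGES = ["pre-1999"] + [str(year) for year in range(2000, 2009)]

# Full universe: 12 classes x 10 vintages = 120 (class, vintage) alternatives.
ALTERNATIVES = list(product(CLASSES, VINTAGES))


def sample_choice_set(chosen, n_random=20, seed=None):
    """Return the chosen alternative plus n_random others drawn uniformly
    without replacement from the remaining alternatives."""
    if chosen not in ALTERNATIVES:
        raise ValueError(f"unknown alternative: {chosen!r}")
    rng = random.Random(seed)
    others = [alt for alt in ALTERNATIVES if alt != chosen]
    return [chosen] + rng.sample(others, n_random)


# Hypothetical observation: the household holds a 2005 SUV.
choice_set = sample_choice_set(("SUV", "2005"), seed=42)
print(len(ALTERNATIVES), len(choice_set))
```

Estimating on a sampled choice set keeps the type sub-model tractable; the chosen alternative is always placed first here only for convenience.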

Case Study (Con’t) Data Statistics

Estimations of Vehicle Type Sub-models

Estimations of Vehicle Type Sub-models (Con’t)

Model Estimations
Model 1: Unordered discrete-continuous model with simulation
Model 2: Unordered discrete-continuous model without simulation
Model 3: Ordered discrete-continuous model
Model 4: Same as Model 2 except no logsum (utility from the type choices)

Model Estimations (Con’t) *Note: Model 1 is the unordered discrete-continuous model with simulation; Model 2 is the unordered discrete-continuous model with numerical computation; Model 3 is the ordered discrete-continuous model; Model 4 is the same as Model 2 except that it excludes the "logsum" variable, which makes it comparable to Model 3.

Model Estimations (Con’t)
[Tables of estimation results; column headings: 1 car, 2 cars, 3 cars, 4 cars, Mileage; #cars, mileage.]

Model Applications

Model Applications (Con’t)

Conclusions Developed an integrated discrete continuous choice model to simultaneously estimate the household choices on vehicle ownership (discrete), the types (discrete) and annual mileage traveled (continuous). The model is able to include a large number of alternatives in both the vehicle holding and the vehicle type choices. The model allows unrestricted correlations of the unobserved factors between the discrete and continuous parts. The model accommodates flexible specifications. There is no budget constraint in the mileage traveled. The model can be applied for policy analysis.

Conclusion (Con’t)
The case study for the Washington D.C. Metropolitan area is based on the latest national dataset, the 2009 NHTS.
The preliminary results show that:
the model gives reasonable estimates of the coefficients;
the covariance matrix explains well the correlations between the unobserved factors in the utilities of the discrete choices and the demand function of the continuous choice;
the non-simulation approach provides better model fit;
the performance of the model improves if the information about vehicle type choice is included;
the unordered discrete-continuous model is more appropriate than the ordered discrete-continuous model for estimating household vehicle ownership and usage decisions.
