System Design of Uber App - Uber System Architecture - GeeksforGeeks
Mai 17, às 17:37
16 min de leitura
The Uber app is a ride-sharing service that allows passengers to request a ride from drivers in the area. The app handles all aspects of the ride, from booking to payment. Here is a high-level architecture of the Uber system:
The user interface allows passengers to book rides and drivers to accept or reject ride requests. This could be a mobile app or a web app.
The application server receives ride requests from passengers and sends them to available drivers in the area. The server also manages the communication between drivers and passengers, handles ride pricing, and provides real-time GPS tracking.
The driver app allows drivers to accept or reject ride requests, track their location and destination, and receive payments for completed rides. It also provides a rating system for passengers and drivers.
The passenger app allows passengers to request rides, track their location and destination, view ride history, and make payments. It also provides a rating system for drivers and passengers.
The payment gateway handles all payment transactions between passengers and drivers. It should be secure and reliable, and support various payment methods, such as credit cards, PayPal, or mobile wallets.
The Uber system should track various metrics such as ride requests, cancellations, payments, ratings, and driver availability. This data can be used to optimize the service and provide insights into user behavior.
A geospatial database is used to store the real-time location data of drivers and passengers. This data is used to match ride requests with available drivers in the area.
The map service provides real-time maps and location data for the driver and passenger apps. It should be fast, reliable, and provide accurate routing and directions.
The messaging service provides real-time messaging between drivers and passengers. It should be secure and reliable and support multiple channels such as SMS, push notifications, or in-app messaging.
The Uber system should be designed to prevent fraud and protect the personal and financial data of passengers and drivers. It should use encryption, authentication, and authorization mechanisms to ensure the security of the service.
The Uber system requires a robust infrastructure to handle a large number of concurrent users and transactions. This includes servers, storage, networking, and load balancing components.
Overall, the Uber system architecture is a complex and highly scalable distributed system that requires the use of multiple technologies and tools to ensure reliability, scalability, and security.
It’s really easy to just tap a button on our mobile phone and get the cab available within a few minutes whenever and wherever we want.
Uber/Ola/Lyft… using these applications and getting the hassle-free transportation service is really simple but is it also simple to build these gigantic applications which have hundreds of software engineers working on them for a decade…? definitely not. These systems have much more complex architecture and there are a lot of components joined together internally to provide riding services all over the world. Designing Uber (or OLA or Lyft) is a quite common question in system design round in interviews. A lot of candidates get afraid of this round more than the coding round because they don’t get the idea that what topics and tradeoffs they should cover within this limited timeframe. Firstly, remember that the system design round is extremely open-ended and there’s no such thing as a standard answer. Even for the same question, you’ll have a totally different discussion with different interviewers.
In this blog, we will discuss how to design ride-hailing services like Uber/Ola/Lyft but before we go further we want you to read the article “How to crack system design round in interviews?”. It will give you an idea that what this round looks like, what you are expected to do, and what mistakes you should avoid in front of the interviewer.
Uber System Architecture
We all are familiar with Uber services. A user can request a ride through the application and within a few minutes, a driver arrives nearby his/her location to take them to their destination. Earlier Uber was built on the “monolithic” software architecture model. They had a backend service, a frontend service, and a single database. They used Python and its frameworks and SQLAlchemy as the ORM layer to the database. This architecture was fine for a small number of trips in a few cities but when the service started expanding in other cities Uber team started facing the issue with the application. After the year 2014 Uber team decided to switch to the “service-oriented architecture” and now Uber also handles food delivery and cargo.
1. Talk About the Challenges
One of the main tasks in Uber service is to match the rider with cabs which means we need two different services in our architecture i.e.
- Supply Service (for cabs)
- Demand Service (for riders)
Uber has a Dispatch system (Dispatch optimization/DISCO) in its architecture to match supply with demands. This dispatch system uses mobile phones and it takes the responsibility to match the drivers with riders (supply to demand).
2. How Dispatch System Work?
DISCO must have these goals…
- Reduce extra driving.
- Minimum waiting time
- Minimum overall ETA
The dispatch system completely works on maps and location data/GPS, so the first thing which is important is to model our maps and location data.
- Earth has a spherical shape so it’s difficult to do summarization and approximation by using latitude and longitude. To solve this problem Uber uses the Google S2 library. This library divides the map data into tiny cells (for example 3km) and gives the unique ID to each cell. This is an easy way to spread data in the distributed system and store it easily.
- S2 library gives coverage for any given shape easily. Suppose you want to figure out all the supplies available within a 3km radius of a city. Using the S2 libraries you can draw a circle of 3km radius and it will filter out all the cells with IDs that lie in that particular circle. This way you can easily match the rider to the driver and you can easily find out the number of cars(supply) available in a particular region.
3. Supply Service And How it Works?
- In our case cabs are the supply services and they will be tracked by geolocation (latitude and longitude). All the active cabs keep on sending the location to the server once every 4 seconds through a web application firewall and load balancer. The accurate GPS location is sent to the data center through Kafka’s Rest APIs once it passes through the load balancer. Here we use Apache Kafka as the data hub.
- Once the latest location is updated by Kafka it slowly passes through the respective worker notes’ main memory.
- Also, a copy of the location (state machine/latest location of cabs) will be sent to the database and to the dispatch optimization to keep the latest location updated.
- We also need to track a few more things such as the number of seats, presence of a car seat for children, type of vehicle, can a wheelchair be fit, and allocation ( for example, a cab may have four seats but two of those are occupied.)
4. Demand Service And How it Works?
- Demand service receives the request of the cab through a web socket and it tracks the GPS location of the user. It also receives different kinds of requirements such as the number of seats, type of car, or pool car.
- Demand gives the location (cell ID) and user requirement to supply and make requests for the cabs.
5. How Dispatch System Match the Riders to Drivers?
- We have discussed that DISCO divides the map into tiny cells with a unique ID. This ID is used as a sharding key in DISCO. When supply receives the request from demand the location gets updated using the cell ID as a shard key. These tiny cells’ responsibilities will be divided into different servers lies in multiple regions (consistent hashing). For example, we can allocate the responsibility of 12 tiny cells to 6 different servers (2 cells for each server) lying in 6 different regions.
- Supply sends the request to the specific server based on the GPS location data. After that, the system draws the circle and filters out all the nearby cabs which meet the rider’s requirement.
- After that, the list of the cab is sent to the ETA to calculate the distance between the rider and the cab, not geographically but by the road system.
- The sorted ETA is then sent back to the supply system to offer to a driver.
If we need to handle the traffic for the newly added city then we can increase the number of servers and allocate the responsibilities of newly added cities’ cell IDs to these servers.
6. How To Scale Dispatch System?
- The dispatch system (including supply, demand, and web socket) is built on NodeJS. NodeJS is the asynchronous and event-based framework that allows you to send and receive messages through WebSockets whenever you want.
- Uber uses an open-source ringpop to make the application cooperative and scalable for heavy traffic. Ring pop has mainly three parts and it performs the below operation to scale the dispatch system.
- It maintains the consistent hashing to assign the work across the workers. It helps in sharding the application in a way that’s scalable and fault-tolerant.
- Ringpop uses RPC (Remote Procedure Call) protocol to make calls from one server to another server.
- Ringpop also uses a SWIM membership protocol/gossip protocol that allows independent workers to discover each other’s responsibilities. This way each server/node knows the responsibility and the work of other nodes.
- Ringpop detects the newly added nodes to the cluster and the node which is removed from the cluster. It distributes the loads evenly when a node is added or removed.
7. How does Uber Defines a Map Region?
Before launching a new operation in a new area, Uber onboarded the new region to the map technology stack. In this map region, we define various subregions labeled with grades A, B, AB, and C.
Grade A: This subregion is responsible to cover the urban centers and commute areas. Around 90% of Uber traffic gets covered in this subregion, so it’s important to build the highest quality map for subregion A.
Grade B: This subregion covers the rural and suburban areas which are less populated and less traveled by Uber customers.
Grade AB: A union of grade A and B subregions.
Grade C: Covers the set of highway corridors connecting various Uber Territories.
8. How does Uber Builds the Map?
Uber uses a third-party map service provider to build the map in their application. Earlier Uber was using Mapbox services but later Uber switched to Google Maps API to track the location and calculate ETAs.
1. Trace coverage: Trace coverage spot the missing road segments or incorrect road geometry. Trace coverage calculation is based on two inputs: map data under testing and historic GPS traces of all Uber rides taken over a certain period of time. It covers those GPS traces onto the map, comparing and matching them with road segments. If we find missing road segments (no road is shown) on GPS traces then we take some steps to fix the deficiency.
2. Preferred access (pick-up) point accuracy: We get the pickup point in our application when we book the cab in Uber. Pick-up points are a really important metric in Uber, especially for large venues such as airports, college campuses, stadiums, factories, or companies. We calculate the distance between the actual location and all the pickup and drop-off points used by drivers.
Image Source: https://eng.uber.com/maps-metrics-computation/
The shortest distance (closest pickup point) is then calculated and we set the pin to that location as a preferred access point on the map. When a rider requests the location indicated by the map pin, the map guides the driver to the preferred access point. The calculation continues with the latest actual pick-up and drop-off locations to ensure the freshness and accuracy of the suggested preferred access points. Uber uses machine learning and different algorithms to figure out the preferred access point.
9. How ETAs Are Calculated?
ETA is an extremely important metric in Uber because it directly impacts ride-matching and earnings. ETA is calculated based on the road system (not geographically) and there are a lot of factors involved in computing the ETA (like heavy traffic or road construction). When a rider requests a cab from a location the app not only identifies the free/idle cabs but also includes the cabs which are about to finish a ride. It may be possible that one of the cabs which are about to finish the ride is more close to the demand than the cab which is far away from the user. So many uber cars on the road send GPS locations every 4 seconds, so to predict traffic we can use the driver’s app’s GPS location data.
We can represent the entire road network on a graph to calculate the ETAs. We can use AI-simulated algorithms or simple Dijkstra’s algorithm to find out the best route in this graph. In that graph, nodes represent intersections (available cabs), and edges represent road segments. We represent the road segment distance or the traveling time through the edge weight. We also represent and model some additional factors in our graph such as one-way streets, turn costs, turn restrictions, and speed limits.
Once the data structure is decided we can find the best route using Dijkstra’s search algorithm which is one of the best modern routing algorithms today. For faster performance, we also need to use OSRM (Open Source Routing Machine) which is based on contraction hierarchies. Systems based on contraction hierarchies take just a few milliseconds to compute a route — by preprocessing the routing graph.
Uber had to consider some of the requirements for the database for a better customer experience. These requirements are…
- The database should be horizontally scalable. You can linearly add capacity by adding more servers.
- It should be able to handle a lot of reads and writes because once every 4-second cabs will be sending the GPS location and that location will be updated in the database.
- The system should never give downtime for any operation. It should be highly available no matter what operation you perform (expanding storage, backup, when new nodes are added, etc).
Earlier Uber was using the RDBMS PostgreSQL database but due to scalability issues uber switched to various databases. Uber uses a NoSQL database (schemaless) built on top of the MySQL database.
- Redis for both caching and queuing. Some are behind Twemproxy (which provides scalability of the caching layer). Some are behind a custom clustering system.
- Uber uses schemaless (built in-house on top of MySQL), Riak, and Cassandra. Schemaless is for long-term data storage. Riak and Cassandra meet high-availability, low-latency demands.
- MySQL database.
- Uber is building their own distributed column store that’s orchestrating a bunch of MySQL instances.
To optimize the system, minimize the cost of the operation and for better customer experience uber does log collection and analysis. Uber uses different tools and frameworks for analytics. For log analysis, Uber uses multiple Kafka clusters. Kafka takes historical data along with real-time data. Data is archived into Hadoop before it expires from Kafka. The data is also indexed into an Elastic search stack for searching and visualizations. Elastic search does some log analysis using Kibana/Graphana. Some of the analyses performed by Uber using different tools and frameworks are…
- Track HTTP APIs
- Manage profile
- Collect feedback and ratings
- Promotion and coupons etc
- Fraud detection
- Payment fraud
- Incentive abuse by a driver
- Compromised accounts by hackers. Uber uses historical data of the customer and some machine learning techniques to tackle this problem.
12. How To Handle The Datacenter Failure?
Datacenter failure doesn’t happen very often but Uber still maintains a backup data center to run the trip smoothly. This data center includes all the components but Uber never copies the existing data into the backup data center.
Then how does Uber tackle the data center failure??
It actually uses driver phones as a source of trip data to tackle the problem of data center failure.
When The driver’s phone app communicates with the dispatch system or the API call is happening between them, the dispatch system sends the encrypted state digest (to keep track of the latest information/data) to the driver’s phone app. Every time this state digest will be received by the driver’s phone app. In case of a data center failure, the backup data center (backup DISCO) doesn’t know anything about the trip so it will ask for the state digest from the driver’s phone app and it will update itself with the state digest information received by the driver’s phone app.
Want to get a Software Developer/Engineer job at a leading tech company? or Want to make a smooth transition from SDE I to SDE II or Senior Developer profiles? If yes, then you’re required to dive deep into the System Design world! A decent command over System Design concepts is very much essential, especially for the working professionals, to get a much-needed advantage over others during tech interviews.
Advantages of the Uber system architecture:
- Scalability: The system can handle a large number of concurrent users and transactions, making it easy to scale as the user base grows.
- Real-time GPS tracking: The system provides real-time GPS tracking, which allows passengers to track the driver’s location and estimated arrival time.
- Efficient ride matching: The system uses a geospatial database to match ride requests with available drivers in the area, resulting in efficient and timely ride matching.
- Payment integration: The system integrates with various payment gateways, making it easy for passengers to pay for rides and for drivers to receive payments.
Enhanced user experience: The system provides a seamless user experience, allowing passengers to quickly book rides and drivers to easily accept or reject ride requests.
Disadvantages of the Uber system architecture:
- Complexity: The system architecture is complex and requires multiple components to work together seamlessly, making it challenging to develop and maintain.
- Dependency on third-party services: The system relies on various third-party services such as map services and payment gateways, which can result in potential downtime or issues with service providers.
- Security concerns: The system stores personal and financial data, which can be vulnerable to hacking or other security breaches if proper security measures are not in place.
- Dependence on internet connectivity: The system requires a reliable internet connection for users to use the service, which can be a disadvantage in areas with poor connectivity.
- Surge pricing: While surge pricing can be an advantage for drivers during peak demand times, it can be a disadvantage for passengers who may have to pay higher prices for their rides.
And that’s why, GeeksforGeeks is providing you with an in-depth interview-centric System Design – Live Course that will help you prepare for the questions related to System Designs for Google, Amazon, Adobe, Uber, and other product-based companies.
Hoje, às 15:34
Hoje, às 15:23
Hoje, às 15:01
Hoje, às 15:00
Hoje, às 14:29
Hoje, às 14:13
Hoje, às 14:00
Hoje, às 13:59
Hoje, às 13:51
AI | Techcrunch
Hoje, às 13:33