💙 Meta and Microsoft love this

ALSO: How to design Web Crawlers

Welcome back, Interview Masters!

Today's coding challenge is a Meta and Microsoft favorite. Can you “sort” this out? 😁

In today's newsletter, we'll cover:

  • Merge Intervals

  • How to design a Web Crawler?

Read time: under 4 minutes

CODING CHALLENGE

Merge Intervals

Given an array of intervals where intervals[i] = [start_i, end_i], merge all overlapping intervals, and return an array of the non-overlapping intervals that cover all the intervals in the input.

Example 1:

Input: intervals = [[1,3],[2,6],[8,10],[15,18]]
Output: [[1,6],[8,10],[15,18]]
Explanation: Since intervals [1,3] and [2,6] overlap, merge them into [1,6].

Example 2:

Input: intervals = [[1,4],[4,5]]
Output: [[1,5]]
Explanation: Intervals [1,4] and [4,5] are considered overlapping.

Solve the problem here before reading the solution.

SOLUTION

To solve this problem, we'll first sort the intervals based on their start times. This makes it easier to merge overlapping intervals.

We'll then iterate through the sorted intervals. For each interval, we'll compare it with the last interval in our result list. If the current interval starts at or before the end of that last interval, they overlap, and we'll merge them by setting the last interval's end time to the larger of the two end times. If they don't overlap, we'll add the current interval to our result list.

The time complexity of this solution is O(n log n) due to the sorting step. The subsequent pass through the intervals is only O(n), so sorting dominates.
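Here's a minimal Python sketch of the approach described above. The function name `merge` and the choice to return a new list (rather than modifying the input) are our own assumptions for illustration, not part of the problem statement.

```python
def merge(intervals):
    """Merge overlapping [start, end] intervals and return a new list."""
    if not intervals:
        return []

    # Sort by start time so overlapping intervals end up next to each other.
    ordered = sorted(intervals, key=lambda interval: interval[0])

    # Copy the first interval so the caller's input is never mutated.
    merged = [list(ordered[0])]

    for start, end in ordered[1:]:
        last = merged[-1]
        if start <= last[1]:
            # Overlap (touching endpoints count): extend the previous interval.
            # max() covers the case where the current interval is fully contained.
            last[1] = max(last[1], end)
        else:
            # No overlap: start a new interval in the result.
            merged.append([start, end])

    return merged


print(merge([[1, 3], [2, 6], [8, 10], [15, 18]]))  # [[1, 6], [8, 10], [15, 18]]
print(merge([[1, 4], [4, 5]]))                     # [[1, 5]]
```

Note the `max()` when merging: for an input like [[1,10],[2,3]], simply overwriting the end time would shrink the interval to [1,3].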

SYSTEM DESIGN EXPLAINED

How to design a Web Crawler?

A web crawler is a system for downloading, storing, and analyzing web pages. It collects web pages by following links to gather new content. Web crawlers are critical components of search engines like Google.

A web crawler system has several key components:

  1. Seed URLs to start the crawling process

  2. URL Frontier to manage the queue of URLs to crawl

  3. HTML Fetcher to download web pages

  4. DNS Resolver to translate the hostnames in URLs to IP addresses

  5. HTML Parser to analyze page content

  6. Duplicate Detection component to avoid storing redundant pages

  7. Data Storage to persist crawled pages

The URL Frontier uses breadth-first traversal to prioritize URL crawling order. The HTML Fetcher retrieves web pages and the HTML Parser checks for issues like malformed HTML. Duplicate Detection uses hashing to identify duplicate content.
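To make the hashing idea concrete, here's a small sketch of a Duplicate Detection component. The class name `DuplicateDetector`, the whitespace normalization, and the choice of SHA-256 are all assumptions for illustration. This only catches exact duplicates after normalization, while production crawlers often use near-duplicate techniques as well.

```python
import hashlib


def fingerprint(html: str) -> str:
    """Return a SHA-256 digest of the page with whitespace runs collapsed."""
    # Collapsing whitespace is an illustrative normalization, so copies that
    # differ only in spacing or line breaks produce the same fingerprint.
    normalized = " ".join(html.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateDetector:
    """Remembers content fingerprints of pages that were already stored."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, html: str) -> bool:
        digest = fingerprint(html)
        if digest in self._seen:
            return True
        self._seen.add(digest)
        return False


detector = DuplicateDetector()
print(detector.is_duplicate("<p>hello</p>"))      # False (first time seen)
print(detector.is_duplicate("  <p>hello</p>\n"))  # True (same content, extra whitespace)
```

Storing fixed-size digests instead of full pages keeps the memory cost per page small and constant.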

The system also has a URL Extractor to parse new links from crawled pages, a URL Filter to exclude unwanted links and a Visited URL Detector to skip already visited URLs.
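Here's a rough sketch of how the URL Frontier, URL Extractor, URL Filter and Visited URL Detector could fit together. Everything named here (`crawl`, `LinkExtractor`, `allowed`, the single-host filter policy, and the in-memory `PAGES` dict standing in for the HTML Fetcher) is a hypothetical stand-in so the example runs without network access. A real crawler would also need politeness delays, robots.txt handling, and distributed storage.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """URL Extractor: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def allowed(url, allowed_host):
    """URL Filter: keep only http(s) links on one host (an illustrative policy)."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and parts.netloc == allowed_host


def crawl(seed_urls, fetch, allowed_host, max_pages=100):
    """Breadth-first crawl; fetch(url) returns HTML text, or None on failure."""
    frontier = deque(seed_urls)  # URL Frontier: a FIFO queue gives breadth-first order
    visited = set(seed_urls)     # Visited URL Detector
    crawled = []

    while frontier and len(crawled) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        crawled.append(url)

        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            link = urljoin(url, href)  # resolve relative links against the page URL
            if allowed(link, allowed_host) and link not in visited:
                visited.add(link)
                frontier.append(link)

    return crawled


# A tiny fake "web" standing in for the HTML Fetcher.
PAGES = {
    "https://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "https://example.com/a": '<a href="/c">C</a> <a href="/">Home</a>',
    "https://example.com/b": '<a href="https://other.example/x">External</a>',
    "https://example.com/c": "<p>No links here</p>",
}

print(crawl(["https://example.com/"], PAGES.get, "example.com"))
```

With this sample data the pages come back level by level: the seed first, then /a and /b, then /c. The external link is dropped by the filter, and the link back to the home page is skipped because it was already visited.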

You can dive into more details here.

NEWS

This week in the tech world

Source: Gerald Mark Soto

Waymo Opens Robotaxi Service to All SF: Alphabet-owned Waymo has expanded its self-driving taxi service to all users in San Francisco. The company has logged 3.8 million rider-only miles in the city and operates in limited capacity in other locations like Los Angeles and Austin.

Anthropic Unveils New AI Model: OpenAI competitor Anthropic has launched Claude 3.5 Sonnet, its most powerful AI model yet. The new model is said to be faster than its predecessor and shows improvements in understanding nuance, humor, and complex instructions.

Tesla Cuts 14% of Workforce: Internal data suggests Tesla has reduced its global workforce to just over 121,000 people, including temporary workers. This represents a cut of more than 14% since the end of 2023, as part of a broader restructuring effort announced by CEO Elon Musk.

Amazon Plans Budget Storefront: Amazon is launching a new storefront for low-cost items shipped directly from China to U.S. consumers. This move aims to counter competition from e-commerce upstarts Temu and Shein. The storefront will feature unbranded items under $20, with delivery times of 9-11 days.

Google Tests Facial Recognition on Campus: Google is piloting facial recognition technology for office security at a site near Seattle. The system compares camera footage to employee badge images to identify unauthorized individuals. Google says the data is not stored and employees can opt out of having their ID images used.

Harvard Dropouts Challenge Nvidia: A startup called Etched, founded by Harvard dropouts, has raised $120 million to develop an AI chip called Sohu. The chip aims to compete with Nvidia in the rapidly growing AI market by focusing on customized, hard-wired chips for specific AI models.

Amazon Sets Prime Day Dates: Amazon has announced that its annual Prime Day mega sale event will take place on July 16 and 17. The two-day discount event will feature "millions" of deals for Prime members.

BONUS

Just for laughs


Until next time, take care! 🚀

Cheers,