Swift code to extract most common page sequences from an Apache server log

Posted on


So I received the following prompt for a code interview:

“Please create an iOS app that downloads a text file from our server, parses the
contents, and displays the results. The file is a standard Apache web server access log. We’re interested in knowing the most common three-page sequences in the file. A three-page sequence is three consecutive requests from the same user.

This is a sample log entry: – – [03/Sep/2013:18:35:38 -0600] “GET /products/phone/ HTTP/1.1” 500 821 “-” “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.65 Safari/537.36”

My ViewController.swift is as follows:

//  ViewController.swift
//  InspriingAppsTest
//  Created by *** on 3/2/21.
import UIKit

class ViewController: UIViewController, UITableViewDelegate, UITableViewDataSource {
    @IBOutlet weak var loadingWindow: UIView!
    @IBOutlet weak var dataViewer: UITableView!
    func tableView(_ tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
        print("Count is "+String(sequenceTally.count))
        return sequenceTally.count
    func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
        let currentCell = dataViewer.dequeueReusableCell(withIdentifier: "cellTemplate") as! PageOccurenceCell
        currentCell.pageOne.text = sequenceTally.sorted(by: {$0.value > $1.value})[indexPath.row].key.split(separator: " ")[0].replacingOccurrences(of: " ", with: "")
        currentCell.pageTwo.text = sequenceTally.sorted(by: {$0.value > $1.value})[indexPath.row].key.split(separator: " ")[1].replacingOccurrences(of: " ", with: "")
        currentCell.pageThree.text = sequenceTally.sorted(by: {$0.value > $1.value})[indexPath.row].key.split(separator: " ")[2].replacingOccurrences(of: " ", with: "")
        currentCell.countLabel.text = String(sequenceTally.sorted(by: {$0.value > $1.value})[indexPath.row].value)
        return currentCell

    var sequenceTally: [String:Int] = [:]
    var sortedNumbers: [String:Int] = [:]
    var userCount: [String:[String]]=[:]
    //var counter: Int = 0
    func processResponse(responseString: String)
        let lines = responseString.components(separatedBy: "n")
        let ipStartIndex = lines[0].index(lines[0].startIndex,offsetBy: 0)
        let ipEndIndex = lines[0].index(lines[0].startIndex,offsetBy: 9)
        let pageStartIndex = lines[0].index(lines[0].startIndex,offsetBy: 48)
        let ipRange: Range = ipStartIndex..<ipEndIndex
        for line in lines
            guard let hIndex = line.firstIndex(of: "H") else{return }
            let pageRange = pageStartIndex..<hIndex
            if(userCount[String(line[ipRange])] == nil)
                userCount[String(line[ipRange])] = []
            if(userCount[String(line[ipRange])]?.count == 3)
                var stringKey: String = ""
                for items in userCount[String(line[ipRange])]!
                    stringKey.append(items + " ")
                if(sequenceTally[stringKey] == nil)
                    sequenceTally[stringKey] = 0
                sequenceTally[stringKey]! += 1
                userCount[String(line[ipRange])] = []
            userCount[String(line[ipRange])]?.append(String(line[pageRange]).replacingOccurrences(of: " ", with: ""))
    override func viewDidLoad() {
        let url = URL(string: "https://logfilehere")!
        URLSession.shared.dataTask(with: url) { (data, response, error) in
            self.processResponse(responseString: String(data: data!, encoding: String.Encoding.utf8)!)
            DispatchQueue.main.async {
                self.dataViewer.delegate = self
                self.dataViewer.dataSource = self
                self.loadingWindow.isHidden = true



ProcessResponse is in the ViewController – it shouldn’t be. It’s a model responsibility.

It’s signature is String -> Void (it returns no values) but it mutates values that it closes over: sequenceTally, userCount. That’s not a good idea: It’s not testable, because you can’t feed the function a known response and assert that the values you get back are what you expect.

Its name is not named following the “Swift API design guidelines”.
ViewController is not a descriptive name for a specific ViewController.

The parsing of each line into an ip address is brittle, because the index offset doesn’t cope well with different length ip addresses. Looking for an “H” or stopping the entire parse early is brittle, too.

It doesn’t parse the line into a suitable data structure, instead you have dictionaries of Strings, Int’s etc.

Making a data structure(s) (e.g. struct User, struct PageRequest) to fit your data means processing the data will become simpler. For example when populating a table view cell, you shouldn’t have to be futher processing your data like .key.split(separator: " ")[1].replacingOccurrences(of: " ", with: "") – you should have something readable like cell.title = threePageSequence.user.name

The processing algorithm is incorrect – take ABCDBCDBCD, the most common triple sequence in that is BCD, but you chunk it into ABC, DBC, DBC and you would say DBC is the most common triple subsequence.

Your code repeats itself a lot, e.g. String(line[ipRange]) – and in places performance is poor because of it. For example in CellForRowAtIndexPath you sort sequenceTally.sorted(by: {$0.value > $1.value} multiple times, and that’s called for each row in the tableView. Instead you could sort an array once, and use that sorted array as the table view’s data source.

Using just the ip address to identify a user is questionable too, the combination of browser and ip address might be better.

Leave a Reply

Your email address will not be published. Required fields are marked *