Analytics And Intelligent Systems

NUS ISS AIS Practice Group

Customer Churn Prediction in the Telecommunications Sector Using Rough Set Approach — June 5, 2017

This study aims to develop an improved customer churn prediction technique, as high customer churn rates have increased the cost of customer acquisition. The technique will be developed by identifying the rule extraction algorithm best suited to extracting practical rules from hidden patterns in telecommunications data.
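To illustrate the idea behind a rough set approach, here is a minimal sketch of lower and upper approximations. Customers who are indistinguishable on the chosen condition attributes form an equivalence class; classes that lie entirely inside the churn group support certain rules, while classes that merely overlap it support possible rules. The records, attribute names and the `approximations` helper are hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical toy records; attribute names and values are illustrative only.
customers = [
    {"plan": "prepaid", "complaints": "high", "churn": "yes"},
    {"plan": "prepaid", "complaints": "high", "churn": "yes"},
    {"plan": "postpaid", "complaints": "low", "churn": "no"},
    {"plan": "postpaid", "complaints": "high", "churn": "yes"},
    {"plan": "postpaid", "complaints": "high", "churn": "no"},
]


def approximations(rows, condition_attrs, decision_attr, target):
    """Return (lower, upper) approximations, as sets of row indices, of the
    rows whose decision attribute equals `target`."""
    # Equivalence classes: rows with identical values on the condition attributes.
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in condition_attrs)].add(i)

    target_set = {i for i, row in enumerate(rows) if row[decision_attr] == target}

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target_set:   # class lies entirely inside the target
            lower |= members
        if members & target_set:    # class overlaps the target
            upper |= members
    return lower, upper


lower, upper = approximations(customers, ["plan", "complaints"], "churn", "yes")
```

In this toy data, prepaid customers with many complaints always churn, so they fall in the lower approximation and would yield a certain rule. Postpaid customers with many complaints are split between both outcomes, so they appear only in the upper approximation.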

Submitted by:
Arun Kumar Balasubramanian
Devi Vijayakumar
Sunil Prakash
Gaelan Gu
Sambit Kumar Panigrahi

Spatio-Temporal Analysis of Students’ Travelling Behaviours — May 5, 2017

Due to occasional train faults and poor knowledge of the best routes, students often spend longer than necessary travelling to school. Another issue is that students who are new to Singapore may not be aware of the amenities available in their immediate vicinity. The objectives of this project are to help optimise travel times for students and to provide helpful information about facilities near their homes or travel routes.

Data Sources

We obtained our data mainly from the following sources:

  • Moves & OpenPaths (Consolidated Students’ Coordinate Data)
  • Twitter API (Tweets on Reports of Train Disruptions, from SMRT’s Official Account)
  • (Shopping Mall & Hawker Centre Data)

Exploratory Analysis

Several tools help us understand how students travel around Singapore. The ones we use here are heatmaps, cluster maps and Anselin’s Local Moran’s I for local spatial autocorrelation.



From the heatmap, we can tell which parts of the island have the heaviest movement activity. One very visible hotspot is the area around NUS, in the southwest of Singapore.
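The density behind a heatmap can be approximated by binning coordinates into grid cells and counting the points in each cell. The sketch below assumes a simple fixed-size grid; the sample coordinates, the `density_grid` helper and the 0.01-degree cell size are illustrative assumptions, not the settings used for the published map.

```python
from collections import Counter

# Hypothetical (latitude, longitude) samples, not the students' actual traces.
points = [
    (1.2966, 103.7764),
    (1.2971, 103.7759),
    (1.2959, 103.7771),
    (1.2840, 103.8510),
    (1.3521, 103.9448),
]


def density_grid(coords, cell_deg=0.01):
    """Count points per grid cell; cell_deg is the cell size in degrees."""
    counts = Counter()
    for lat, lon in coords:
        # Floor division maps each coordinate to the index of its grid cell.
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        counts[cell] += 1
    return counts


grid = density_grid(points)
hotspot_cell, hotspot_count = grid.most_common(1)[0]
```

The cell with the highest count corresponds to the brightest area of the heatmap. A smaller `cell_deg` gives a finer but noisier picture.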

Cluster Maps


The cluster map groups nearby data points together and shows the number of points in each cluster. As a result, we can easily compare one hotspot with another. It is also possible to tell how many outliers we have on the map. To explore further, we can click on any cluster to zoom in to the individual points that make it up.

Anselin’s Local Moran’s I for Outlier Analysis


Using Anselin’s Local Moran’s I to help us distinguish clusters from outliers, we can tell how long students typically spend at each location compared with their neighbours. Red points are high-high points: students spend a long time there, and so do students at neighbouring locations, which indicates a concentration of long stays. Blue points are low-low points and signify the contrary. Strictly speaking, high-high and low-low points form clusters of similar values; the spatial outliers are the high-low and low-high points, whose durations differ markedly from those of their neighbours.
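As a rough sketch of how the statistic is computed, the local Moran's I for location i is the product of its own deviation from the mean and the average deviation of its neighbours, divided by the variance. The dwell times, the neighbour lists and the `local_morans_i` helper below are hypothetical, and the permutation-based significance test that GIS tools normally apply is omitted.

```python
# Hypothetical dwell times (minutes) at five locations arranged in a line;
# each location's neighbours are the locations adjacent to it.
durations = [50.0, 55.0, 10.0, 12.0, 11.0]
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}


def local_morans_i(values, nbrs):
    """Return a (statistic, quadrant) pair for every location."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]      # deviations from the mean
    m2 = sum(d * d for d in z) / n      # variance of the values
    results = []
    for i in range(n):
        # Spatial lag: average deviation of the neighbours
        # (equivalent to row-standardised weights).
        lag = sum(z[j] for j in nbrs[i]) / len(nbrs[i])
        stat = z[i] * lag / m2
        # Deviations of exactly zero are treated as "low" in this sketch.
        if z[i] > 0 and lag > 0:
            quadrant = "high-high"
        elif z[i] <= 0 and lag <= 0:
            quadrant = "low-low"
        elif z[i] > 0:
            quadrant = "high-low"
        else:
            quadrant = "low-high"
        results.append((stat, quadrant))
    return results


results = local_morans_i(durations, neighbours)
quadrants = [q for _, q in results]
```

A positive statistic means a location resembles its neighbours (high-high or low-low). A negative statistic flags a location that differs from them, such as the third location here, a short stay next to a long one.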


We now address our project objectives by examining the impact of train faults on students’ travel plans and by providing useful information on nearby amenities.

Train Faults


By plotting the lines where disruptions have occurred, we can see which line is most prone to train faults (at least over the past 2 years). The area around Bishan station on the North-South line is also quite prone to such incidents. We also found that train disruptions occur most often on Mondays and Thursdays. Although it is not practical to avoid travelling on those days, we advise students to be prepared for a higher probability of incidents.
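A day-of-week breakdown like the one described can be obtained by counting disruption dates per weekday. The dates below are hypothetical placeholders, not the disruptions collected from Twitter, and `count_by_weekday` is an illustrative helper name.

```python
from collections import Counter
from datetime import date

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical disruption dates, for illustration only.
disruption_dates = [
    date(2017, 4, 3),
    date(2017, 4, 6),
    date(2017, 4, 10),
    date(2017, 4, 13),
    date(2017, 4, 19),
]


def count_by_weekday(dates):
    """Count how many of the given dates fall on each day of the week."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return Counter(DAY_NAMES[d.weekday()] for d in dates)


weekday_counts = count_by_weekday(disruption_dates)
```

With real data, the same counts could be plotted as a bar chart to show which days stand out.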

Shopping Malls


Using medoid clustering to determine the centres of the clusters of students’ movements, we created a 3 km geofence around each centre to highlight the shopping malls closest to these clusters. As shown in the diagram above, we identified 3 main clusters in Singapore: NUS, the CBD and the east side. Few shopping malls fall within the clusters, which is understandable as these clusters are residential districts. Customer ratings are also available for each shopping mall.
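The geofence step can be sketched as follows, assuming the points have already been assigned to a cluster. The medoid is the cluster member with the smallest total distance to the other members, and a place is inside the geofence if its great-circle distance from the medoid is at most 3 km. The coordinates, the place names "Mall A" and "Mall B", and the helper names are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def medoid(points):
    """Return the member with the smallest total distance to all members."""
    return min(points, key=lambda p: sum(haversine_km(p, q) for q in points))


def within_geofence(centre, places, radius_km=3.0):
    """Names of the places lying within radius_km of the centre."""
    return [name for name, coord in places.items()
            if haversine_km(centre, coord) <= radius_km]


# Hypothetical members of one cluster and hypothetical places of interest.
cluster = [(1.2966, 103.7764), (1.2975, 103.7780),
           (1.2950, 103.7750), (1.3040, 103.7740)]
places = {"Mall A": (1.3020, 103.7650), "Mall B": (1.3500, 103.8500)}

centre = medoid(cluster)
nearby = within_geofence(centre, places)
```

Unlike a centroid, a medoid is always an actual data point, so the geofence is centred on a location that a student really visited.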

Hawker Centres


Hawker centres, on the other hand, tell a different story. There is an abundance of hawker centres within the clusters, and they are relatively evenly distributed. The government has made a conscious effort to ensure that affordable eateries are within walking distance for everyone living in these residential districts. Customer ratings for these hawker centres are available too.

Proposal of Alternative Travelling Route


We have attempted to propose an alternative travelling route in the event of a train disruption, from Raffles Place to NUS ISS. As this route planning algorithm is in its infancy, we have only proposed this one alternative, which is only possible via car/taxi.


In this project, we achieved our objectives of proposing alternative routes and helping students plan their travel routes better in the event of train disruptions, as well as suggesting helpful information on nearby amenities.

We hope you enjoy exploring these maps and discovering further helpful geospatial insights about Singapore.

The published map app may be viewed at this link.

Published by Team TRUMP [EBAC 04]

Sunil Prakash

Gaelan Gu

Ethiraj Srinivasan

Suma Mulpuru

Sindhu Rumesh Kumar

Leukemia Prediction Using Sparse Logistic Regression — April 29, 2017

This paper aims to predict the diagnosis of Acute Myeloid Leukemia (AML) from flow cytometry data using more automated decision-making systems. The dataset was obtained from the DREAM6 AML Prediction Challenge. As the dataset is high-dimensional (84 dimensions), Linear Discriminant Analysis was applied to reduce the dimensionality and simplify the analysis.

Finally, a sparse logistic regression model will be used to classify a patient as being AML-positive or -negative with estimated probabilities.
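To show what makes a logistic regression sparse, here is a minimal sketch that assumes an L1 penalty fitted by proximal gradient descent, a common way to drive uninformative weights to exactly zero. It is not the implementation used in the referenced paper; the toy data, the function names and the hyperparameters `lam`, `lr` and `epochs` are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Shrink v towards zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_sparse_logreg(X, y, lam=0.05, lr=0.1, epochs=500):
    """L1-penalised logistic regression via proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        p = [sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) for row in X]
        err = [pi - yi for pi, yi in zip(p, y)]
        grad_w = [sum(e * row[j] for e, row in zip(err, X)) / n for j in range(d)]
        grad_b = sum(err) / n
        # Gradient step on the log-loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b  # the intercept is not penalised
    return w, b


# Hypothetical data: the first feature separates the classes, the second does not.
X = [[2.0, 0.1], [1.5, -0.1], [1.8, 0.0],
     [-2.0, 0.1], [-1.5, -0.1], [-1.8, 0.0]]
y = [1, 1, 1, 0, 0, 0]

weights, bias = fit_sparse_logreg(X, y)
prob_positive = sigmoid(weights[0] * 1.0 + weights[1] * 0.0 + bias)
```

The uninformative second feature ends with a weight of exactly zero, which is the sparsity property. The fitted model still returns an estimated probability for each sample rather than a bare class label.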

Journal Reference:

Manninen T, Huttunen H, Ruusuvuori P, Nykter M (2013) Leukemia Prediction Using Sparse Logistic Regression. PLoS ONE 8(8): e72932. doi:10.1371/journal.pone.0072932

Team Trump [EBAC 04]
Gaelan Gu
Sunil Prakash
Yu Yue
Wang Ruoshi

Costly Singapore – Dashboard — March 16, 2017

We created this dashboard to answer important questions about the rising cost of living in Singapore and to understand what makes it one of the most expensive cities in the world to live in today. We analyze this phenomenon, forecast the short-term trend and examine key metrics in further detail. The questions we evaluate are as follows:

  1. Food prices have been increasing, but are they a considerable factor contributing to Singapore’s aggregate cost of living?

To determine whether food prices put a large strain on Singaporeans’ income, we look at food’s share of total monthly expenditure in 2015. However, housing rent takes a much bigger slice of the pie, at more than 40%.

  2. Healthcare costs have always been high, but is this due to a shortage of doctors in Singapore?

We examine the supply and demand of healthcare in a general sense: supply is the number of doctors per 1,000 residents and demand is the total resident population. The latter is rising faster than our ‘supply’ of doctors. This suggests that doctors’ fees might increase as well, as a result of this shortfall.

  3. The housing market is becoming increasingly competitive, especially with Singapore’s perpetual issue of limited land space. What are some of the factors leading to this?

As Singapore’s land area has remained at around 700 square kilometers for the past decade, housing prices are rising due to a land crunch. This phenomenon can be evaluated in greater detail through the trend of the various housing indices over the years, such as HDB, private landed and private non-landed.

  4. How does our cost of living compare with other parts of the world, especially in Asia?

The cost of living is computed from user inputs and price data consolidated from government sources, to generate the average expenses in a given city for a 4-person family. With New York City in the US as the benchmark (index = 100), we can see how Singapore ranks against it, as well as against her Asian neighbors and other partners around the globe.
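One plausible way to build such an index, assuming it is a simple ratio to the benchmark city, is to divide each city's average expenses by New York City's and multiply by 100. The expense figures and the placeholder names "City A" and "City B" below are hypothetical and do not come from the dashboard's data.

```python
BENCHMARK = "New York City"

# Hypothetical average monthly expenses for a 4-person family, in one currency.
monthly_cost = {
    "New York City": 5000.0,
    "City A": 4000.0,
    "City B": 2500.0,
}


def cost_of_living_index(city_cost, benchmark_cost):
    """Express a city's cost relative to the benchmark, where benchmark = 100."""
    return 100.0 * city_cost / benchmark_cost


indices = {
    city: cost_of_living_index(cost, monthly_cost[BENCHMARK])
    for city, cost in monthly_cost.items()
}

# Rank cities from most to least expensive.
ranking = sorted(indices, key=indices.get, reverse=True)
```

An index of 80 would mean a city is 20% cheaper than the benchmark, while anything above 100 would be more expensive than New York City.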

  5. After analyzing the individual aspects of Singapore’s rising cost of living, how can we forecast this for the foreseeable future?

The housing and food price indices, as well as the general and healthcare CPI, all appear to be on an upward trend. As a result, the cost of living index is likely to maintain a similar trend for the next 1 to 2 years. We can perform a simple forecast for each of the components to visualize the impact on Singaporeans.
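A simple forecast of this kind could be a straight-line trend fitted by least squares and extended one or two years ahead. This is only one possible choice, and the dashboard's actual forecasting method may differ. The yearly index values and the `fit_linear_trend` helper are hypothetical.

```python
def fit_linear_trend(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical yearly index values, for illustration only.
years = [2012, 2013, 2014, 2015, 2016]
index_values = [100.0, 102.0, 104.0, 106.0, 108.0]

intercept, slope = fit_linear_trend(years, index_values)

# Extend the fitted line over the next two years.
forecast = {year: intercept + slope * year for year in (2017, 2018)}
```

A straight line is only reasonable over a short horizon, which matches the 1 to 2 year window discussed above.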

Link to Dashboard on Tableau Public

Published by:

Sunil Prakash
Gaelan Gu
Ma Min
Wang Ruoshi
Yu Yue

[EBAC 04 – 2017]