Analytics And Intelligent Systems

NUS ISS AIS Practice Group

Spatial-Temporal Analytics with Students Data to recommend optimum regions to stay — May 7, 2017


Objectives & Motivation:

The objective is to find a convenient place for students to stay, based on data collected in Singapore.

Any international student coming to Singapore needs reliable information about convenient places to rent. To address this, we analyzed movement data collected by NUS students from the ISS school and carried out exploratory spatial data analysis on it to find patterns and insights. We assumed that the basic amenities a student needs are economical accommodation, cycling paths, libraries, closeness to an MRT station, parks, and hawker centers.

Data Sources:

Dataset description Data Source URL
Data points of all NUS-ISS students IVLE (app used: OpenPaths)
Dwelling data
Hawker Centres
Cycling Path
Park Connector Loop




OpenPaths data containing students' personal location information was collected for the month of April, and the data was then cleaned. Because the data was stored on multiple mobile devices, the date and time formats were inconsistent, so all date and time values were converted to a single standard format. To support further analysis, separate columns were created for date and time. Outliers were treated using R, and data points outside Singapore were ignored. The Name and MailID columns were edited in Excel to fill in missing details.
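The cleaning steps described above (a single timestamp format, separate date and time columns, and dropping points outside Singapore) can be sketched as follows. This is a minimal illustration rather than the team's actual workflow, which used R and Excel: the candidate timestamp formats, the bounding box, and the field names `timestamp`, `lat` and `lon` are all assumptions.

```python
from datetime import datetime

# Assumed input formats; the post does not list the formats found on the devices.
CANDIDATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M", "%m-%d-%Y %I:%M:%S %p")

# Rough bounding box around Singapore, chosen here for illustration only.
SG_LAT_RANGE = (1.15, 1.48)
SG_LON_RANGE = (103.59, 104.10)


def parse_timestamp(text):
    """Try each candidate format in turn; return None if none matches."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def clean_points(rows):
    """Keep rows inside the bounding box and split the timestamp into date and time."""
    cleaned = []
    for row in rows:
        stamp = parse_timestamp(row["timestamp"])
        in_sg = (SG_LAT_RANGE[0] <= row["lat"] <= SG_LAT_RANGE[1]
                 and SG_LON_RANGE[0] <= row["lon"] <= SG_LON_RANGE[1])
        if stamp is None or not in_sg:
            continue  # unparseable timestamp or point outside the box
        cleaned.append({
            "lat": row["lat"],
            "lon": row["lon"],
            "date": stamp.date().isoformat(),
            "time": stamp.time().isoformat(),
        })
    return cleaned


sample = [
    {"timestamp": "2017-04-03 23:15:00", "lat": 1.2966, "lon": 103.7764},
    {"timestamp": "03/04/2017 08:30", "lat": 1.3000, "lon": 103.7700},
    {"timestamp": "2017-04-05 10:00:00", "lat": 3.1390, "lon": 101.6869},  # outside the box
]
print(clean_points(sample))
```

Day-first and month-first formats cannot always be told apart from the text alone, so in practice the format list would need to be confirmed per device.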

Exploratory Analysis of Students Data (with derived fields):

Mean Centre:


From the above exploration, it is evident that most students stay at the university, travel to their residence, and use the MRT most of the time. Since students would prefer to stay near the university and close to an MRT station, and could make use of cycling paths, our aim is to find and suggest better zones for student life. We therefore need to add these layers to the student data and carry out the analysis.


To get more insight into suitable dwellings, we separately analyzed several datasets: HDB dwelling population data, hawker center locations, MRT stations, and cycling paths.

Geographically Weighted Regression:

To draw insights from the student data using the HDB and population data layers, geographically weighted regression (GWR) was performed with the variables shown below, on the assumption that data points with a late-night timestamp represent a student's home. A spatial join was done on the hawker center, dwelling and OpenPaths student data layers to obtain the final GWR model.
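The home-location assumption can be illustrated with a small filter that keeps only late-night points and counts them per polygon. This is a sketch under stated assumptions: the 23:00 to 05:00 window is a guess (the post does not give the hours used), and the `zone` field is a hypothetical stand-in for the polygon identifier produced by the spatial join.

```python
from collections import Counter
from datetime import time

# Assumed "late night" window; the post does not state the exact hours used.
NIGHT_START = time(23, 0)
NIGHT_END = time(5, 0)


def is_late_night(t):
    """True when t falls inside a window that wraps past midnight."""
    return t >= NIGHT_START or t < NIGHT_END


def home_points_per_zone(points):
    """Count late-night points per zone (the zone labels here are hypothetical)."""
    return Counter(p["zone"] for p in points if is_late_night(p["time"]))


points = [
    {"zone": "zone_a", "time": time(23, 30)},
    {"zone": "zone_a", "time": time(2, 15)},
    {"zone": "zone_b", "time": time(14, 0)},   # daytime, not treated as home
    {"zone": "zone_b", "time": time(4, 59)},
]
print(home_points_per_zone(points))
```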

Explanatory Variables:

The count_ variable is the number of student data points per polygon, Count_1 is the number of hawker centers per polygon, SHAPE_Area is the area from the dwelling layer, and HDB is the total count of HDB per person.

GWR Model

The GWR results show that the model performs with moderate accuracy, with an adjusted R² of 0.31.

The Spatial Autocorrelation (Moran’s I) tool was run on the regression residuals to check whether the model residuals are spatially random. Statistically significant clustering of high and/or low residuals (model under- and overprediction) indicates that our GWR model is not accurate enough for prediction.
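For readers unfamiliar with the statistic, global Moran's I can be computed directly from the residuals and a spatial weights matrix. The sketch below uses the standard formula with a simple hand-written adjacency matrix. It is only an illustration: the ArcGIS tool also reports a z-score and p-value, which are not computed here, and the sample values are made up.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)


# Four polygons in a row; neighbours share an edge (made-up example).
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(morans_i([1, 2, 3, 4], chain_weights))    # similar neighbours: positive
print(morans_i([1, -1, 1, -1], chain_weights))  # alternating neighbours: negative
```

Values near zero suggest spatially random residuals, while clearly positive values suggest clustering of similar residuals.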

GWR Fail

The map below shows the local R-squared values for estimating HDB population from the student data. The dark red regions are where the model predicts with higher accuracy, and the blue regions are where its prediction is poor.


As the spatial autocorrelation test (Moran’s I) shows that the model residuals are not random, the model cannot be used for prediction purposes. It was built only to study the student data together with another layer of data.

Thus, to find convenient places for students, the places were ranked using student data points and factors like MRT, cycling path and hawker centers availability.


Ranking zones based on Hawker Centres:

To get the density of hawker centers in each zone, we calculated a new field, which was used for ranking:

The Reclassify tool was then used on this field to rank the regions by hawker center density, which shows places ranked as economical zones for food.
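The density-then-reclassify step can be sketched as below. This is an illustration, not the ArcGIS workflow itself: it assumes density is the hawker center count divided by polygon area, that classes are equal-width intervals, and that rank 1 goes to the densest class. The post does not state the class boundaries or the number of classes.

```python
def hawker_density(count, area_sq_km):
    """Hawker centers per square kilometre (assumed definition of the new field)."""
    return count / area_sq_km


def reclassify_to_rank(values, n_classes=5):
    """Equal-interval reclassification; rank 1 is the highest-density class."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1 for _ in values]
    width = (hi - lo) / n_classes
    ranks = []
    for v in values:
        band = min(int((v - lo) / width), n_classes - 1)  # 0 = lowest band
        ranks.append(n_classes - band)
    return ranks


densities = [hawker_density(c, a) for c, a in [(0, 4.0), (5, 2.0), (10, 2.0), (15, 2.0), (20, 2.0)]]
print(densities)
print(reclassify_to_rank(densities))
```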


Ranking zones Based on proximity to cycling paths:

The areas are ranked by their proximity to cycling paths: the data is converted to raster data and then ranked.


Ranking zones based on MRT by reclassify:

Easy access to public transport is one of the major considerations when choosing a place to stay, so proximity to MRT stations was used as a factor to rank areas. The MRT location data is converted to raster data to rank areas by their proximity to MRT stations.
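Once each raster cell holds a distance to the nearest MRT station, ranking by proximity amounts to mapping distances into bands. A minimal sketch, assuming distances in metres and hypothetical band boundaries (the post does not give the breaks used in the Reclassify tool):

```python
from bisect import bisect_left

# Band boundaries in metres are assumptions for illustration only.
BREAKS_M = [500, 1000, 2000, 4000]


def proximity_rank(distance_m, breaks=BREAKS_M):
    """Return 1 for the closest band and len(breaks) + 1 for the farthest."""
    return bisect_left(breaks, distance_m) + 1


for d in (300, 500, 501, 1500, 5000):
    print(d, proximity_rank(d))
```

The same function could be applied to distances from cycling paths, since the post ranks both layers by proximity.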


 Final Ranking:

The final rank for each zone in Singapore was calculated as the average of the other three ranks (MRT, hawker center, and cycling path). Areas with a good rank are the most favorable for staying. This final ranking can help in choosing a place to stay based on individual priorities.
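The averaging step is simple enough to show directly. The sketch below assumes that lower rank numbers are better and that all three layers use the same scale. The optional weights are an addition for illustration, reflecting the remark about individual priorities, and the zone names and rank values are made up.

```python
def final_rank(mrt_rank, hawker_rank, cycling_rank, weights=(1, 1, 1)):
    """Weighted average of the three layer ranks; equal weights give the plain mean."""
    ranks = (mrt_rank, hawker_rank, cycling_rank)
    return sum(w * r for w, r in zip(weights, ranks)) / sum(weights)


# Hypothetical zones with (MRT, hawker, cycling) ranks.
zones = {"zone_a": (1, 2, 3), "zone_b": (2, 1, 1), "zone_c": (4, 5, 3)}

ordered = sorted(zones, key=lambda name: final_rank(*zones[name]))
print(ordered)
print(final_rank(1, 2, 3, weights=(2, 1, 1)))  # MRT counted twice as heavily
```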


 Final Rank




Using the final ranking, we can recommend a convenient place to stay to a new student coming to study at NUS, taking MRT access, cycling paths and food places into account.

As expected, the better-ranked zones are clustered near NUS itself, although the ranking also suggests other places.

As future scope, we can add a configurable element that replaces the hawker center layer with other layers such as libraries, parks, HDB rental prices or bus stops, to form a practical tool for incoming students.

For detailed analysis, please read the entire report: – Spatial-Temporal Analytics with Students Data

Explored and submitted by: 





LI MEIYAO (A0163379U)




To find most accessible study areas for students in NUS

Problem Space:


Analysis Strategy:

  • What? – Availability of study areas for students
  • Where? – Inside NUS campus
  • Why? – The various reasons for the preferred locations
  • How? – Finding the clusters around various study locations

Data Exploration and Feature Addition:

  • Data source is OpenPaths
  • The class data was cleansed and the records with geographical coordinates within Singapore/NUS were chosen
  • The initial analysis of the sampled data was performed using tools such as R, Carto and ArcGIS to study the geographic spread
  • A new dataset was created to represent the various study centres in NUS
  • Transformation was performed to achieve the variables in the necessary format for the geo-visualization
  • Reverse geocoding was performed on the dataset using the ‘ggmap’ package in R and the corresponding locations were obtained

Preliminary Analysis:


  • Carto was used to analyse the spread of the data points during the class hours and after the class hours
  • The analysis showed that after class hours the population is more concentrated at University Town, owing to the availability of more study areas and facilities
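The comparison above depends on splitting the points into class hours and after class hours. A minimal sketch of that split, assuming class hours of 09:00 to 17:00 (the post does not state the actual hours) and a hypothetical `time` field on each point:

```python
from datetime import time

# Assumed class hours; adjust to the actual timetable.
CLASS_START = time(9, 0)
CLASS_END = time(17, 0)


def split_by_class_hours(points):
    """Return (during_class, after_class) lists of points."""
    during, after = [], []
    for p in points:
        if CLASS_START <= p["time"] < CLASS_END:
            during.append(p)
        else:
            after.append(p)
    return during, after


sample_points = [
    {"id": 1, "time": time(10, 30)},
    {"id": 2, "time": time(19, 45)},
    {"id": 3, "time": time(17, 0)},
]
during_class, after_class = split_by_class_hours(sample_points)
print(len(during_class), len(after_class))
```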



Base Map Creation and Polygon Generation:


Addition of Layers:


Step 3: Loaded the class data (namely, the master class dataset) and converted the coordinates for visibility

Density Analysis:



Step 6: The high-density area was in and around the National University of Singapore. We can conclude that the data points belong to people who either work at NUS or are students of NUS.

Assumption and Addition of new Layer:


Hot Spot Analysis:

Step 8: Hot Spot Analysis was performed using Arc Tool Box -> Spatial Statistics Tools -> Mapping Clusters -> Optimized Hot Spot Analysis


Model Diagram – Proximity Analysis:


Step 9: Proximity Analysis was performed to find the most accessible study areas for students in NUS, using Arc Tool Box -> Analysis Tools -> Proximity -> Near
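The Near tool assigns each input point its closest feature and the distance to it. The same idea can be sketched outside ArcGIS with the haversine formula. In the sketch below the study-area names and all coordinates are hypothetical placeholders, not surveyed locations.

```python
from collections import Counter
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearest_area(point, areas):
    """Return (area_name, distance_m) for the study area closest to the point."""
    return min(
        ((name, haversine_m(point[0], point[1], lat, lon))
         for name, (lat, lon) in areas.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical study areas and student points (lat, lon).
study_areas = {"area_a": (1.3050, 103.7730), "area_b": (1.2990, 103.7720)}
student_points = [(1.3048, 103.7735), (1.2995, 103.7718), (1.2992, 103.7725)]

nearest_counts = Counter(nearest_area(p, study_areas)[0] for p in student_points)
print(nearest_counts)
```

Counting how many points fall nearest to each study area gives a rough view of which areas are most accessible to the sampled students.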

Proximity Analysis Results:


Inferences/Solution Outline:

  • Comparing the inferences obtained from Carto and the model built in ArcGIS, we find that the students focus only on University Town
  • From the model it is evident that there were other study areas that could be preferred, as the data points were close to these study areas
  • The students can explore other areas such as FOE and SOC for holding discussion sessions
  • We can propose a new study area at an optimal location based on the geographic distribution of student data in case the number of students enrolled increases over a period of time


  • It’s a student’s location data
  • Sample size is limited to ISS students only
  • Accurate shuttle bus timing data was not available
  • The number of students enrolled in each faculty was not available

Team Name: Incognito

Team Members: Pankaj Mitra, Deepthi Suresh, Anand Rajan, Neeraja Lalitha Muralidharan, Nandini Paila Reddy

Delay Estimation in Pedestrian crossing — April 29, 2017


Team Name: Incognito

Team Members: Deepthi Suresh, Neeraja Lalitha Muralidharan, Pankaj Mitra, Sindhu Rumesh Kumar


This study uses multiple linear regression:

1. To provide theoretical support for traffic management and control

2. To increase efficiency at intersections and improve security

One-page journal summary of “Estimates of pedestrian crossing delay based on multiple linear regression and application”, authored by Li Dan and Xiaofa Shi.


Click here for Journal.

Analysis on Workplace Injuries — March 17, 2017



The objective of this dashboard is to show the different levels and types of injuries that occur in workplaces in Singapore.




  1. In which industry do the majority of injuries occur?

The majority of injuries have occurred consistently in the Construction and Manufacturing industries over the years, but in 2016 injuries in “Accommodation and Food Services” also rose.

  2. With the current scenario in workplaces, which type of injury occurs most frequently?

The number of minor injuries (94.9%) greatly exceeds the number of major (4.5%) and fatal (0.5%) injuries.

  3. Is there a trend in the number of injuries across industries through the years 2014-2016?

The numbers of fatal and major injuries are consistent through the years, while the number of minor injuries dropped in 2015 and increased drastically in 2016.

  4. What are the common types of minor injuries across different industries?

Injuries caused by slips, trips and falls are the most common across industries. Injuries caused by cuts or stabs by objects are more common in the Accommodation and Food Services industry, while workers in the Construction and Manufacturing industries are more affected by moving objects.

  5. Why have minor injuries increased drastically from 2015 to 2016 while fatal and major injuries are consistent?

Good measures have been taken to avoid fatal injuries, yet minor injuries have not decreased. This could be because incidents that would otherwise have been fatal now result in minor injuries.


Final Analysis

We can clearly see that fatal and major injuries are very low and that steps have been taken to reduce them further, yet there is still a rise in minor injuries. This may be because incidents that would otherwise have been fatal now result in minor injuries.

Being struck by moving or falling objects is one of the common causes of minor injury in both the Construction and Manufacturing industries. As a common solution, we could attach wireless sensors (heat and motion) to the objects, which sound an alert when an object comes within 50 m of any person. This way, the people working at the site will be aware of any moving objects in their vicinity and can move out of harm's way.

The injuries may be minor, but if the injuries occur simultaneously to multiple people, it may affect the overall productivity of the company.


Tableau Public link:

Workplace Injuries in Singapore(2011-2016)

Submitted by:

Pankaj Mitra (A0163319E)

Deepthi Suresh (A0163328E)

Neeraja Lalitha Muralidharan (A0163327H)

Kriti Srivastasa (A0163206N)

Sindhu Rumesh Kumar


Spatialite – Spatially Lighting You in A Dynamic Way (Spatio-Temporal Analytics) — June 13, 2017


Street lights in Singapore are valuable but expensive assets for the city. However, according to a recent study published in Science Advances (Jun 2016), Singapore was named the country with the worst level of light pollution in the world, with a pollution level of 100 per cent. The use of artificial light here far exceeds the level of light pollution tolerable per capita.

Today’s street lights are a lot to manage, and tend to function inefficiently by wasting energy when they are on. Hence, the Remote Control Monitoring System (RCMS) was designed with energy savings as the goal. Although RCMS presents opportunities for saving energy cost, street lighting can be further optimized by taking into account the trend of people who are outdoors at night. With the advantage of geospatial analytics, we would like to introduce: Street Light Time + Weather + Flow rate of Pedestrians



Geospatial Analysis Techniques Used:

  1. Points to Line
  2. Buffering
  3. Overlaying
  4. Clipping

Geospatial Analysis Used:

  1. Heat Map
  2. Cluster and Outlier Analysis
  3. Hot Spot Analysis
  4. Directional Distribution Analysis (for movement data)

Summary of Findings:


  • Light up in the presence of a person or car, and remain dim the rest of the time.
  • Autonomous dimming when no movement detected.
  • Predict the movement pattern and light up ahead.
  • Consider the distance between each lamppost through geo analysis


  • Improve the quality of life by reducing artificial light
  • Reduce light pollution level
  • Significant energy saving

You are welcome to view the full ArcGIS Map Journal here.

The presentation slides are also available from this link.

Presented to you by: Team GEOSPIES


Forecasting Customer Lifetime Value: A Statistical Approach — June 8, 2017

The objective of the study is to predict the lifetime value of a customer from the customer database, in order to quantify the customer’s worth to an organisation and come up with an appropriate CRM strategy to either retain the customer or invest in new customers.


Click here for journal

-Submitted By
Abhinaya M [A0163311W],
Allen Geoffrey Raj [A0163398R],
Aravind Somasundaram [A0163301X],
Preethi Jennifer [A0163190L],
Ram Nagarajan [A0163247E]


An American Improving Academic Customer Library Relations with Social Listening: A Case Study of an American Academic Library — June 7, 2017


Strategic social media plays a crucial role in contemporary customer relationship management (CRM); however, the best practices for social CRM are still being discovered and established. The ever-changing nature of social media challenges the ability to establish benchmarks; nonetheless, this article captures and shares actions, insights, and experiences of using social media for CRM. This case study examines how an academic library at a mid-size American university located in northeast Florida uses social media to engage in social listening and to enhance CRM. In particular, the social listening practices of this library are highlighted in relation to how they influence and potentially improve CRM. By exploring the practices of this single institution, attempts are made to better understand how academic libraries engage with customers using social media as a CRM tool and ideas for future research in the realm of social media and CRM practices are discussed.

Key Words:
Academic Library, Customer Relationship Management, Facebook, Hashtags, Instagram, Library Customers, Social CRM, Social Media, Strategic Social Listening, Thomas G. Carpenter Library, Twitter



SUBMITTED BY: Ding Renzhi A0163220X, Gu Zhuyi A0163219H, Ma Min A0163305N, Gao Ruofei A0163436E, Zheng Weiyu A0163412R

Managing Customer Loyalty through Acquisition, Retention and Experience Efforts: An Empirical Study on Service Consumers in India — June 5, 2017

Recent developments in the marketing literature highlight the significance of consumer relationship management (CRM) in driving consumer loyalty (CL). In order to provide a clear understanding of the impact of CRM on CL, this study develops an integrated framework of CRM activities (acquisition, retention, and experience) to manage customer loyalty through direct and indirect approaches (with the mediation of satisfaction, trust, and commitment). The article uses a survey-based empirical study of 600 consumers from three service sectors (health, retail, and wellness). The findings suggest that a firm that pays more attention to managing consumer experiences would benefit significantly from the implementation of CRM programs. Consumer experience efforts have a positive impact on CL through commitment in all three sectors. Service managers should be clearly aware that consumers are not looking for just traditional CRM benefits such as value proportion, reward points and so on, but specifically seek a pleasant experience across various touchpoints. Various frameworks of CRM activities for managing customer loyalty have been analyzed.

Key Words:
Consumer loyalty, Consumer relationship management, Consumer experience management, Satisfaction, Trust, Commitment


Source : Managing Consumer Loyalty through Acquisition, Retention and Experience Efforts: An Empirical Study on Service Consumers in India

Submitted by TEAM MARS

  1. Mutharasan Anbarasan(A0163257A)
  2. Raghavan Kalyanasundaram(A0163316L)
  3. Saravanan Kalastha Sekar(A0163309H)
  4. Seshan Sridharan(A0148476R)
  5. Sindhu Rumesh Kumar(A0163342M)

Customer Churn Prediction in the Telecommunications Sector Using Rough Set Approach


This study aims to develop an improved customer churn prediction technique, as high customer churn rates have increased the cost of customer acquisition. The technique is developed by identifying the most suitable rule extraction algorithm for extracting practical rules from hidden patterns in telecommunications sector data.


Submitted by:
Arun Kumar Balasubramanian
Devi Vijayakumar
Sunil Prakash
Gaelan Gu
Sambit Kumar Panigrahi

Recurrent Neural Networks for Customer Purchase Prediction on Twitter

Objective: To identify whether a user will buy a product based on their sequential tweets, and to improve the prediction of customer purchases. It also aims to filter out non-buyers based on their tweets.


Source: Recurrent Neural Networks For Customer Purchase Prediction on Twitter

Submitted By: Ashok Kuruvilla Eapen, Abhilasha Kumari, Pranav Agarwal, Navneet Goswami and Rohit Pattnaik

This study was done to understand the impact of culture on winning back defected customers. It was conducted on college-age consumers in America and China, a segment targeted by many technology and personal services.

H1: Chinese customers, when compared to American customers, will be more influenced by a WOW offer when deciding whether to switch back to the original provider.
H2: Chinese customers will be more influenced by relative social capital when deciding whether to switch back.
H3: Chinese customers will be less influenced by their post-switching regret when deciding whether to switch back.



Submitted by:
Muni Ranjan<A0163382E>, Pradeep Kumar<A0163453H>, Anusuya Manickavasagam<A0163300Y>, Khine Zin Win<A0163222U>



Journal Summarization for Customer Relationship Management (EB5203) Assignment

In this study, a conceptual framework is proposed to mathematically evaluate the hypothesised relationship that perceived value and interactivity have with customer dissatisfaction issues. The relationship between customer satisfaction issues, loyalty and customer acquisition is then tested, with the aim of enhancing customer satisfaction and loyalty.


Team Members:-

Prashant Jain, Praveen Tiwari, Kavya AK, Praman Shukla

Analyzing the Effectiveness of Customer Retention Strategies with Existing Customers in Banking Industry — June 3, 2017

The rationale of this study is to find out the effectiveness of customer retention strategies from the perspective of the existing customers of a bank in Dehradun (India). The study indicated that the age, gender and income of existing customers are statistically insignificant in customer retention, while education background is statistically significant.

From the factor analysis, the existing customer retention strategies can be grouped into three principal components: value-added services, convenience and business development. The multiple regression analysis indicated that the effectiveness of retention strategies depends on these three factors derived from the factor analysis.


Please click here for Journal.

Team members:
Low Kang Jiang | Narasimhan Balasubramanian | Nie Bixuan | Tan Hui Keng | Qin Si