This is an HTML version of an attachment to the Official Information request 'Communication regarding Population Density programme'.



Population density pilot

The pilot

Together we will prove:
●  the details of the commercial viability of a population density product and the data
●  the quality of population density data inferred from location estimates and Stats NZ population expertise
●  the high value use cases

We recommend up to a 6 month pilot to prove the initial model, followed by further iterations of the product and product roadmap informed by the learnings.



What will the pilot answer?

We will have worked with you to create a business case that adds value to your organisation.
It will have answered what the value of this product is.
For government, it’s about making better decisions for Aotearoa NZ.
For others, it’s about learning about their community and optimising and growing revenue.
For all, it’s about creating clear public benefit in doing this.



Who has been involved so far?

Over the last three months 12 government organisations have provided validation and testing towards the product being provided in this pilot.
This pilot is the next step in developing a viable product offering that will deliver value to the customers in that group.



What are the benefits of the pilot?

●  Access to data at a level of frequency and resolution that is better than anything available on the market
●  The expertise of Stats NZ that improves the quality of the data so you don’t have to.
●  Any privacy and confidentiality risks are managed by Stats NZ expertise and reviewed by the Office of the Privacy Commissioner
●  Simplifying the government procurement process



What does success look like?

We will agree on success criteria for a pilot, and we expect them to be themed around:
●  You develop at least one valuable workflow to incorporate the data provided by us in the pilot
●  We reduce your costs relating to the acquisition and processing/handling of data



What is the commitment from you?

●  People from your legal, data/insights, financial and executive areas who can attend weekly progress meetings, take part in ongoing agreement discussions, and have the authority to make appropriate decisions. We are expecting this will require up to 40 hours a month of time across these areas of your organisation.
●  A workshop with relevant parties to understand the use cases to build success criteria.
●  A commitment to work to build a business case to secure funding for the 18/19 financial year if value is identified.
●  A small upfront contribution to help cover some of the data processing costs?



What to expect at the end?

Within 6 months we will have made a seamless transition from testing viability of the product to a live product in use by your organisation.



Next steps

Signing an MOU to go ahead with the pilot and confirm the commitment of resources needed.




[email address]     @dataventuresnz
https://medium.com/data-ventures

Data Ventures pilot plan

Data Ventures is the commercial arm of Stats NZ. We are engaging with interested data providers on opportunities to increase the value of their data and their supporting services and functions by collaborating with Stats NZ and its unique position in the data ecosystem.

More specifically, Data Ventures adds value throughout your data product and service pipeline, and the content below describes how we would collaborate with you to achieve this value add.

Evaluation of your data
Data Ventures will work with you to understand your data and to provide an evaluation of your data. This includes audits on the quality of the data, understanding the variances, and identifying what actions could be taken to improve quality.

Validating use cases
As a government organisation, Data Ventures has the trust to test and validate the use cases with government agencies and Crown-owned entities. We work to understand their end-to-end processes and how the data can be used to support their goals.

Creating a suitable offering to customers
As use cases are validated and specific needs are identified, we work with you to identify where there are opportunities to create a suitable offering. This may include applying specific models from Stats NZ expertise to the data, or combining the data with other data sources to improve the quality and fitness for purpose.

Managing risks
Data Ventures will manage all the associated risk with the product. This includes the privacy and confidentiality risk of the data, and any external perception risk. This is done through statistical methodology expertise and data management expertise within Stats NZ.

Product to market
Because Data Ventures is a government agency, the burden and overheads associated with government procurement are reduced. We also have better access to government customers and their data needs/use cases due to the insights from other Stats NZ product offerings.

Ongoing product management
Data Ventures will provide ongoing management of the products and act as a feedback source and lead generator for additional products and services that fall outside the scope of Data Ventures.

Market analysis
Data Ventures will supply the value-added dataset back to you for your own insights and analysis, e.g. understanding your market share.

Population Density
Version 1 Product Information

V0.3 - MVP Release for Pilot

1.  Contents

Contents
Version Control
Product Team
Introduction
Target Customers
Customer Use Cases
Customer Benefits
Features
Privacy Impact Assessment
Population Density From Mobile Data
Population Density stored in a database
API Connection
Web Interface
Customer Interface
Searchable entities
Technical Environment
Product Roadmap
Known Issues

2.  Version Control

Date        Details                                       Issued By  Version
19 Oct 18   Initial strategy draft for internal approval  JM         0.1
7 Dec 18    Pre-Release Team Revision                     JM         0.2
15 Feb 19   MVP Release for Pilot                         JM         0.3
May 19      Population Density - Product Release          JM         1.0
3.  Product Team

Jamie Marshall    Product Owner
Robert Chiu       Business Development
Drew Broadley     Executive Director
Holly He          Operations
 
4.  Introduction
Population Density is the first product from Data Ventures, where Stats NZ public data and methodologies meet data from private companies to produce population insights that will allow customers to make informed decisions based on recent population data.
In this first iteration, the product will allow customers to query the population density database and will provide aggregated data at the National, Regional and Territorial Local Authority levels through a web-hosted Application Programming Interface (API). Customers that are unable to use the API will be able to query the database through a web interface and then download the results through a web portal.
5.  Target Customers
Population Density V1 will be available only to Government Departments and Crown Entities, including State-Owned Enterprises, Crown Research Institutes, Crown Financial Institutions, Crown-owned entity companies and other Crown entity companies.
6.  Customer Use Cases
A number of government agencies have told us that they are trying to solve problems where the data needed to answer their questions effectively just isn’t available.
Some of the use cases that the agencies may have could be:
●  What is the population change in a region from a Rugby test match?
●  Is the infrastructure sufficient for peak population demands and how can we distribute the load?
●  Where do we put services after an emergency event?
●  Where is a good place to put a new school?
●  Is this suburb growing or shrinking?
●  Where are the summer and winter hotspots, and what is the impact of population on our national parks?
The population data from the queries can then be used on its own or added to the Crown agencies’ own insights and data to assist in decision making.
7.  Customer Benefits
The Population Density V1 product will give customers access to data that will allow them to make the best population-based insights for their department or business. This will deliver the following benefits:
●  Ability to search by day of the week allows identification of the effects of weekend travel on population density, to help determine the effect that population density might have on residential areas or recreational facilities during non-work days.
●  Ability to search by hour of the day allows identification of the effects of commuting on population density by seeing population density changes in a region during working hours.
●  Ability to search by date allows the identification of the effects of specific events in the area of interest on population density, such as public holidays, sporting and entertainment events.
●  Having the data sources updated regularly ensures that customers are making their decisions with the most current information, because population density may drift over time from base population information such as the census.
●  Having the data sources updated regularly allows customers to identify population density trends over time, to ensure that infrastructure is being delivered to areas that may require investment due to increasing population.
●  Having the data sources updated regularly allows customers to isolate seasonal changes in population density, allowing customers to make business decisions around seasonal variations in population due to cropping, seasonal work and the tourist industry.
●  Having the data updated regularly allows customers to respond to any time-sensitive problems.
8.  Features
The features associated with Population Density V1 are created to allow ease of integration into customers’ existing geospatial technologies through the development of a web-hosted API. Those customers without geospatial technologies need to be able to acquire and visualise the data through a graphical user interface (GUI).
8.1.  Privacy Impact Assessment
The privacy impact assessment was undertaken by an independent third party, Info by Design. The key conclusion from the Assessment is:
“There are no identified privacy risks to the proposed Population Density product. Data Ventures is not collecting or using personal information as it is defined in the Privacy Act and this analysis of Data Ventures’ proposed processes show that it is following best practice information management and the OPC’s data and analytics principles.”
The Office of the Privacy Commissioner (OPC) reviewed the Privacy Impact Assessment and a corresponding application for the Privacy Trust Mark. The Office stated that:
“Population Density, therefore, is complying with its obligations under the Privacy Act by not collecting personal information where it is unnecessary”
The Privacy Trust Mark was not granted as the Population Density product does not collect any personal information or allow individuals to be re-identified through the product or contributing data.
8.2.  Population Density From Mobile Data
The aggregated data from the individual telcos are stored in separate bins in the AWS environment. These datasets are then aggregated further into a single dataset where Stats NZ statistical methods and demographics will take the aggregated location information and make this an indicator of the population in a geographical area.
The population density will be a count of indicated population per geographic area. The population counts will be by hourly slices, days of the week and dates. Over longer periods of time, hourly resolution of population estimates may be too granular, so summarised results at a daily, weekly, monthly or seasonal resolution will be made available to customers.
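The summarising step described above can be sketched as follows. This is a minimal illustration only: the source does not say how the summaries are calculated, so the choice of a daily mean and a daily peak, the row layout and the function name `summarise_daily` are all assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def summarise_daily(hourly_rows):
    """Group (timestamp, area, count) rows by area and calendar day.

    Assumption: a daily summary is the mean and the peak of the hourly
    counts. Hourly counts are not summed, because the same people may be
    present in an area across many hours of the day.
    """
    buckets = defaultdict(list)
    for timestamp, area, count in hourly_rows:
        buckets[(area, timestamp.date())].append(count)
    return {
        key: {"mean": mean(counts), "peak": max(counts)}
        for key, counts in buckets.items()
    }


# Hypothetical hourly counts for one area.
rows = [
    (datetime(2019, 2, 1, 8), "Penrose", 9000),
    (datetime(2019, 2, 1, 9), "Penrose", 9600),
    (datetime(2019, 2, 2, 8), "Penrose", 4000),
]
daily = summarise_daily(rows)
```

The same grouping key could be swapped for an ISO week, a month or a season to produce the coarser resolutions.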
 
8.3.  Population Density stored in a database 
The aggregated Population Density will be stored in a database in a secure cloud-hosted environment. The database format is to be determined by the developer but is expected to be natively hosted by the web hosting service.
8.4.  API Connection
The predominant connection to the database is to be by way of an API. The API will detail all of the supported queries and outputs from the database.
The API is to be documented and made available publicly as part of the sales and marketing package to allow customers to fully evaluate the product before purchase.
The language of the API is GraphQL. Details of GraphQL can be found here:
https://graphql.org/
8.4.1.  API Documentation/Schema
GraphQL is a self-documenting API language: the schema can be queried through the API itself.
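As an illustration of what self-documenting means in practice, the sketch below builds a standard GraphQL introspection request. The `__schema` fields are part of the GraphQL specification. The product's own query fields are not given in this document, so none are shown, and the helper name `build_graphql_body` is hypothetical.

```python
import json

# Standard GraphQL introspection query: asks the server to describe
# its own schema (root query type plus every named type).
INTROSPECTION_QUERY = """
{
  __schema {
    queryType { name }
    types { name kind description }
  }
}
"""


def build_graphql_body(query, variables=None):
    """Serialise a GraphQL request in the usual JSON-over-HTTP shape."""
    payload = {"query": query.strip()}
    if variables:
        payload["variables"] = variables
    return json.dumps(payload)


# The body would be sent as an HTTP POST with Content-Type: application/json
# to the API endpoint (the endpoint address is not stated in this document).
body = build_graphql_body(INTROSPECTION_QUERY)
```

A customer evaluating the product could use the response to list the available queries and their argument types before writing any integration code.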

8.5.  Web Interface
8.5.1.  Customer Interface
As there are a number of customers without the ability to implement an API-based interface to the database, there is a requirement for a customer-accessible web-based graphical user interface (GUI).
This GUI shall be accessible from the web browsers that are currently supported by the customers. Due to the environments being implemented by customer information technology departments, these may not be the latest versions of web browsers.
The GUI shall include user and company authentication.
The GUI will present a graphical query/filter interface that allows the user to:
●  Select the searchable entities to be used in the query (free searches are not available)
●  Return a table with the population count by area of interest
●  Download the results as a comma-separated values (CSV) text file
The results shall be in accordance with the customer data dictionary.
8.6.  Searchable entities
The following searchable parameters will be available through the API and Web Interface.
●  Search by date and time
Web Interface: Customers will be able to search by hour of the day and date in NZDT or NZST. This carries some risk because it differs from the API, which uses UTC, but the Web Interface and its outputs are designed to be human consumable. There could therefore be some inconsistencies on the days when daylight saving starts or ends. The table will return the date and time in either NZDT or NZST, depending on the date.
API: The API will accept and return the date and time in UTC.
●  Filter results by location
Web Interface: Results can be filtered by either Territorial Local Authority (TLA) or Region. If either a TLA or Region is selected, the table will only contain those area units within the selected boundary.
API: The API will accept TLA or regional filters and return results from area units within the selected boundary.
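The daylight saving inconsistency mentioned for the Web Interface can be shown with a short sketch using Python's standard `zoneinfo` module. The function name `local_hour_to_utc` is hypothetical and the code is not part of the product. It only illustrates why one local hour can map to two different UTC hours on the day daylight saving ends.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

NZ = ZoneInfo("Pacific/Auckland")


def local_hour_to_utc(year, month, day, hour, fold=0):
    """Convert a New Zealand local hour to UTC.

    fold=0 selects the first occurrence of a repeated local hour (NZDT),
    fold=1 selects the second occurrence (NZST).
    """
    local = datetime(year, month, day, hour, tzinfo=NZ, fold=fold)
    return local.astimezone(timezone.utc)


# Daylight saving ended on 7 April 2019, so 02:00 local occurred twice.
first = local_hour_to_utc(2019, 4, 7, 2, fold=0)   # 2019-04-06 13:00 UTC
second = local_hour_to_utc(2019, 4, 7, 2, fold=1)  # 2019-04-06 14:00 UTC
```

A query for 02:00 on that date through the Web Interface is therefore ambiguous, while the same query expressed in UTC through the API is not.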
8.7.  Technical Environment
The product will be hosted on Amazon Web Services (AWS), including the database, administration and interfaces.
Security and privacy of the data, and user administration, are to be factored into the deployment of the product.
9.  Product Roadmap
Population Density V1 will be improved upon by adding additional datasets in the future to increase confidence in the indications of population density.
Population Density V1 and subsequent versions will also be a base dataset for future products, such as identifying travel patterns.
10.  Known Issues
No known issues with the product at this time.


8/12/2019
population_density_mobile_data_definition_v3 - Google Docs
Pilot: Mobile Location Data Definition

What data do we need for the pilot using 13 months of national data?

We need two sets of data:

1. Device Counts: a device count per cell tower, broken down to hourly, for each day across the 13 months.
2. Cell Coverage Data (current and historic, covering the 13 months): coverage data for telecommunications companies, including cell tower locations across all frequency bands for New Zealand. This would usually be three layers, one each for 2G, 3G and 4G; a normalised layer of all three is fine too.
 
Specifications for Device Counts:

Area:         National
Time period:  Hourly intervals from 12:00am 01/02/2019 through to 11:59pm 28/02/2019
Data:         Date/Time (hourly), Cell Tower ID, Cell Join (Coverage), MSISDNS (counts of device numbers per sector/coverage area)

Note: We have used the column names provided as per the calibration data.

Definition of a “count” is based on the first activity a device has in the hour period by cell tower/sector.
The specifications are based on the output from the Feb 1st calibration data.
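The counting rule above can be sketched as follows. This is an illustration under stated assumptions, not the telcos' actual process: the event layout (device ID, timestamp, cell ID) and the function name `hourly_device_counts` are hypothetical, and the pilot itself only receives the resulting counts.

```python
from collections import Counter
from datetime import datetime


def hourly_device_counts(events):
    """Count each device once per hour, at the cell of its first activity.

    events: iterable of (device_id, timestamp, cell_id) tuples.
    Returns a Counter keyed by (hour_start, cell_id).
    """
    first_seen = {}
    for device_id, timestamp, cell_id in sorted(events, key=lambda e: e[1]):
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        # setdefault keeps only the earliest cell for this device and hour.
        first_seen.setdefault((device_id, hour), cell_id)
    return Counter((hour, cell_id) for (_, hour), cell_id in first_seen.items())


# Device A is active at 1:15 (cell X) and 1:45 (cell Y); only 1:15 counts.
events = [
    ("A", datetime(2019, 2, 1, 1, 15), "X"),
    ("A", datetime(2019, 2, 1, 1, 45), "Y"),
    ("B", datetime(2019, 2, 1, 1, 30), "Y"),
]
counts = hourly_device_counts(events)
# counts holds one device for cell X and one for cell Y in the 01:00 hour.
```

Counting on first activity means a device that moves between towers within the hour is never counted twice in that hour.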
 
The Data will contain the following fields:

Field       Description
Date-Time   Hourly intervals
Cell Name   Cell tower name
Cell Join   Mobile coverage area ID (site ID/coverage look up ID; Statistical Area 2 (suburb))
MSISDNS     Mobile device number count (number of mobile devices)

This can be provided as a CSV, or any other format that works for you. 
 
Specifications for Cell Coverage Data: 
 
We also need the geographical boundary of each cell tower’s coverage, per band if easier, so that we can spatially relate the coverage to statistical boundaries.
 
These can be in any supported format but should contain a reference to the cell tower id to attach the 
device counts and date/time to. Preferences are for ESRI compatible SHP files with associated DB 
fields for each coverage area. 
 
What happens during the pilot? 
 
We develop a model that combines all telco data sets into a single view of population across 13 months.
 
As part of that we also convert device numbers to population counts using the Stats NZ IP around 
surveys and other respondent data in different areas of NZ. 
 
Through this we may have some questions around the data. 
 
This is the first effort to get a baseline count. From there, we can work with you to define Local, Domestic and International profiles of devices for our use and yours.
 
What data is required beyond the 13 months for the pilot? 
 
On the successful completion of this phase of the pilot, we will look into how we could segment the data into international, domestic and local population. We will discuss this as another phase of the pilot, as we first need industry parties (including the Office of the Privacy Commissioner) involved to workshop definitions of what is local, domestic and international population.

This will happen before we determine what further data is required.
 
Questions? 
Contact Robert Chiu @ Data Ventures - [email address]

Population Density Data Dictionary
Date:     2019-03-05
Version:  1.10

Name: Date/Time From
Type: YYYY-MM-DD HH:MM:SS
Description: This specifies the date and time from which the data is attributed (start of the range). Ranges can be as small as one hour, and as large as 365 days.

Name: Date/Time To
Type: YYYY-MM-DD HH:MM:SS
Description: This specifies the date and time to which the data is attributed (end of the range). Ranges can be as small as one hour, and as large as 365 days.

Name: Statistical Area 2
Type: String
Description: This specifies the geographical boundaries the data represents, e.g. [Penrose] or [Ponsonby East]. We use the official Stats NZ definition of a Statistical Area 2, which is as follows: Statistical Area 2s are aggregations of meshblocks. They are non-administrative areas that are in between meshblocks and territorial authorities in size. Statistical Area 2s must either define or aggregate to define, regional councils, territorial authorities and urban areas.

Name: Count
Type: Integer
Description: This represents the total number of people our models have calculated to be within the attributed time and place, e.g. 9,632.

Notes
To anticipate a likely question about people being in more than one location at once: our models calculate population counts based on aggregated data of the first unique counts within the time period. In other words, if there were two data points, one at 1:15 and one at 1:45, only the 1:15 data would be used when aggregating the counts up to the hour.
A future data dictionary might be expanded to other common date filters, e.g. day of the week (Tuesdays), week of the year (Week 14), seasons, months, and years.
We will look to add other geographical boundaries such as regional councils, territorial authorities, Statistical Areas 1 and 2 in the future.
For a visual representation, here are the New Zealand Statistical Area 2s visualised on a map: https://datafinder.stats.govt.nz/layer/92212-statistical-area-2-2018-generalised/
Delimiters - NOT space : / or -
File Formats - I'm guessing CSV?
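A record following this data dictionary could be read and checked as sketched below. This is a minimal sketch under assumptions: the notes leave the file format unconfirmed, so comma-separated values and the column order (from, to, area, count) are assumed here, and the names `DensityRecord` and `parse_record` are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"  # YYYY-MM-DD HH:MM:SS, as in the dictionary


@dataclass
class DensityRecord:
    time_from: datetime
    time_to: datetime
    statistical_area_2: str
    count: int


def parse_record(row):
    """Parse one row and enforce the one hour to 365 day range rule."""
    record = DensityRecord(
        datetime.strptime(row[0], FMT),
        datetime.strptime(row[1], FMT),
        row[2],
        int(row[3]),
    )
    span = record.time_to - record.time_from
    if not timedelta(hours=1) <= span <= timedelta(days=365):
        raise ValueError(f"range {span} is outside one hour to 365 days")
    return record


# Hypothetical sample row using the example values from the dictionary.
sample = "2019-02-01 08:00:00,2019-02-01 09:00:00,Penrose,9632\n"
records = [parse_record(row) for row in csv.reader(io.StringIO(sample))]
```

A comma is used as the delimiter because the date and time values themselves contain spaces, colons and hyphens.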