Feb 17, 2015 01:23 AM

Guizhou's Wang Jiangping on the Big Data Buzz

Guizhou Province has set its sights on tapping the global shift to cloud computing by encouraging new businesses in the big data storage, data mining and processing sector.

The provincial government is one of several in China pushing for big data businesses as an attractive economic growth option, and one that offsets an ongoing slowdown for the nation's heavy industry and manufacturing sectors.

A key leader of the Guizhou initiative is the province's Deputy Governor Wang Jiangping, 50, who took office two years ago after more than 20 years in the business world. Wang is optimistic about the commercial potential for big data technology, and thinks governments in China should cultivate supportive business environments to encourage companies in this field.

Wang recently sat down with Caixin to discuss his provincial government's ambitious effort to break new ground in the big data arena, as well as a unique project in Guizhou aimed at building a regional big data center.

As Wang sees it, Guizhou is experimenting with a business model that, if all goes well, would transform this landlocked, relatively underdeveloped province into a buzzing hub for high-tech businesses. Excerpts from the interview follow.

Caixin: Big data is a high-tech industry, but Guizhou is far below national averages in terms of economic and technological development. Why is the province developing this industry now?

Wang Jiangping: At the moment, China's industrial development is being constrained by limits on resources and environmental protection issues. This has forced us to carefully consider what kind of industrial base we need for the future development of the Guizhou economy. After repeated discussions, we eventually decided to target five new industries, with information technology at the core.

Big data facilities require sufficient electricity supplies, a moderate climate and geological stability. Guizhou has all of these. Guizhou also has another key advantage: Because Guizhou is relatively undeveloped, it is an inexpensive place to integrate fragmented and isolated databases, which today are among the requirements for developing a big data industry. Regions where IT sectors are already well-developed may face greater obstacles in meeting these requirements.

Because the central government has rolled out favorable policies for IT development, consensus has been growing among Guizhou officials – down to the grassroots level – to cultivate a big data industry. Accordingly, we have worked out plans and offered support, such as a financing platform for the industry and incentives for start-ups.

How much money has the provincial government spent on basic infrastructure?

We've spent approximately 60 million yuan on server procurement and budgeted another 9 million yuan a year for leasing mainframes and broadband (Internet) access. Alibaba Group Inc. helped us put them together free of charge, and once our big data platform is up and running, we can split the profits with Alibaba.

Government agencies, such as health and social security bureaus, might be reluctant to transfer their databases to the Guizhou cloud. How will the provincial government encourage them?

We should be brave enough to "stand in the field of death and fight to live." We've used our mandate to encourage them to do so. In other words, we will no longer allocate money from the provincial government budget to buy conventional IT infrastructures for these departments.

For example, seven government departments overseeing public transportation, tourism, food safety and environmental protection used to buy their own IT systems and build their own mainframes, and replace them with new ones every couple of years with government money. From now on, they are required to join the big data program.... They must buy services from the program operator and move their stuff. If they refuse, they'll have to cope with their old systems, which won't last.

Has transferring to the cloud been a smooth process?

We are all very satisfied with the progress to date. We now have 41 departments that are transferring their entire IT systems onto the cloud. Some 26 are now fully functioning, and the other 15 are undergoing tests.

Seven government departments have had their IT systems transferred to the Guizhou cloud to facilitate government procurement services. Perhaps this is not a "big data industry." What else can Guizhou do to build an industry?

The government has formed a business entity to manage the cloud system unit for each of the seven departments. For example, Guizhou Food Safety Cloud Information Co. Ltd. is responsible for the part of the cloud system that serves the food safety department. This company can be authorized by the (relevant) government department to access databases for various applications. For instance, via the cloud unit under the tourism department, developers can build a platform that gives tourists information on scenic spots and helps them plan itineraries. They can also take advantage of the cloud system by cooperating with other private companies in developing new businesses such as value-added taxi services.

Selecting and authorizing a firm is very important. The process can help a government determine where to draw the line in terms of opening up government databases. The process of selecting and authorizing top-tier companies is open to all companies through open tenders. These are not necessarily state-owned companies.

What has been gained so far from the big data project?

The first step was to establish the basic infrastructure. Building the data center has attracted a lot of investment. In 2014, we built mainframe facilities covering 200,000 square meters of combined floor space, and nearly 10,000 servers were put into use. We will add another 100,000 servers in 2015.

The bandwidth for data outbound from Guizhou has also doubled, from 1,000 gigabytes in early 2014 to 2,000 gigabytes at the end of the year.

Second, since the Guizhou Cloud platform's inception, we've seen some revenue from government cloud service procurements. This will definitely generate more income as government departments start purchasing services. The seven companies operating under the Guizhou cloud system will spawn a raft of new businesses that are expected to generate rapid growth in the future. Of course, output in the first year will not be very significant.

What special challenges have you encountered?

A high degree of planning is needed before government data is released. The first step is for the government to build a regulatory framework. The next step is to enact legislation. All government departments benefit, although some will release their data willingly and others will not. Therefore, through regulations, data transparency should be made a standard practice for everyone.

Because there's no precedent, we'll experiment as we go along. This is a process that involves continuous discovery. Initially, we plan to go for things we can easily handle and move on to those that are more difficult. We started at the municipal level and moved up to the provincial level.

The city of Guiyang has already written a plan for setting guidelines over the release of government databases to the public. The next step is to classify data and create a catalogue for data marked for public access. Already, Guiyang has started soliciting opinions from government departments over this work.

It will be up to them to decide how to classify data in each sector. Our principle is that departments ought to classify the information they store, and those that get access to this data will be held responsible. In other words, relevant departments with data should follow classifications and ratings, while those applying for access should be responsible for integrity and safety.

How do you decide the extent to which government data can be accessed? What data do you think can be made public? What should be kept confidential?

That decision will be made by each department, according to the industry it's regulating. In general, every industry's data can be divided into classified data, sensitive data and non-sensitive data. We are now disclosing non-sensitive data.

For example, the Guizhou government organized a "Transportation Algorithm Battle," a competition in which firms were asked to find ways to fix Guiyang's traffic congestion problems. Developers were given access to Department of Transportation data on traffic jams, as well as traffic light positions and timing data. Using this information, developers were able to create timing algorithms for traffic lights in order to mitigate congestion. The Guiyang Transportation Department's release of data such as traffic light locations and the times at which the lights changed was part of the data transparency process.

License plate numbers of cars passing those traffic lights were sensitive data, as they involve individual privacy issues. Developers were only told how many cars passed a traffic light at a given time. As for the vehicle owners (who they were, what their occupations were, what their family situations were, etc.), all such information had to be kept confidential.

Sometimes we have sensitive information, such as medical data concerning how many people are treated for AIDS annually. While it would be useful to analyze changes and trends related to the disease, publicly releasing this information might lead to social repercussions. As a rule of thumb, I think, sensitive data can be made public after it's desensitized. But the process should be handled with caution and a certain degree of control.

Are you under pressure to push for access to government data? Are there any future risks?

The pressure is always there, but now there's somewhat less. When we started, by no means were all departments willing to release data. We analyzed every department's actual situation, including whether we should consider their data sensitive, and how open their leaders were to new ideas. In the end, we chose seven departments that we felt would move ahead most easily, and brought them into the process of releasing data. But this was insufficient. Now, I want all departments to understand data transparency, and to release data to the greatest extent possible. So I still feel some stress. But as more data is made more transparent, I'll feel more relaxed.

As for the risks, they've been clear to us from the beginning. First, we might finish the data center even without forming a mechanism for supplying data. Second, data that's released might not be in a format that can be utilized. Thus, there's a risk that we may lose control of the process.

Throughout this process, the Guizhou government has worked to avoid these risks. Regarding the first risk, the Guizhou government is working to integrate a great deal of data, make it transparent, create an environment where new ideas can be forged, work to attract developers and investors who can dig deep to unearth the data's true value, and create new industry models and urban centers. These steps can help foster a big data industry.

The second risk is a question of safety. We require that all infrastructure, computer hardware and operating platforms are made in China. We are carefully controlling the scale of publicizing data to ensure that all data is released safely.

(Rewritten by intern researchers Roma Eisenstark and Nick Horton.)
