Big Data tool for Social Media Statistics Analysis
One of the world’s well-known social networks hired the SCAND development team to analyze insights about its users: the number of new users joining the network daily, their activity and online behavioral patterns, viral pattern detection, and collaborative filtering.
The customer wanted to track users’ digital profiles and, when needed, get ad hoc access to all the required information to facilitate running ad campaigns online.
- Perform automatic reporting.
- Make analytical views in the form of tables and charts.
- Provide summary views.
- Create analytical dashboards.
- Gather analytics and keep them in key-value storage via Kafka and HBase.
- Capture file-based data with the help of Cassandra.
- Make SQL-like requests and process them through Hive.
- Create targeting campaigns.
The development team decided to perform a general uplift of the customer’s online analytics system using a bleeding-edge technology platform. The reinforced solution was required to provide customizable and flexible access to the analytical data of users’ digital profiles, both static and dynamic. It was also expected to track demographic statistics pulled together from a variety of sources.
First, SCAND Big Data developers add anchor points (1×1 pixel images) to each social network page, which makes it possible to mark and track all users via cookies. New users who have no cookie yet are assigned one. The data is delivered through Kafka and collected in HBase.
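The cookie step can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the cookie name `visitor_id`, the function `handle_pixel_request`, and the event fields are all hypothetical, and a plain list stands in for the Kafka producer that would forward events towards HBase.

```python
import time
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name, not taken from the project


def handle_pixel_request(cookie_header, page_url, event_sink):
    """Process one request for the 1x1 tracking image.

    Returns the visitor ID and the Set-Cookie value to send back
    (None when the visitor already carried a cookie).
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")

    set_cookie = None
    if COOKIE_NAME in cookies:
        # Returning visitor: reuse the ID stored in the cookie.
        visitor_id = cookies[COOKIE_NAME].value
    else:
        # New visitor: generate an ID and prepare a cookie for the response.
        visitor_id = uuid.uuid4().hex
        fresh = SimpleCookie()
        fresh[COOKIE_NAME] = visitor_id
        fresh[COOKIE_NAME]["path"] = "/"
        set_cookie = fresh[COOKIE_NAME].OutputString()

    # Stand-in for publishing the view event to a message queue.
    event_sink.append({
        "visitor_id": visitor_id,
        "page": page_url,
        "new_visitor": set_cookie is not None,
        "timestamp": time.time(),
    })
    return visitor_id, set_cookie
```

In a real deployment the handler would also return the image bytes and send the event to a queue rather than appending it to a list; those parts are omitted here because the source does not describe them.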
Second, these marks are combined with social network profile data: age, sex, user-defined tags, etc. Once that is done, it becomes possible to add link translation for the majority of ‘native’ links in the users’ posts. In other words, if a user posts such a link on their wall, the clicks and users’ interests are intercepted in the same manner as the views. Cosine-based collaborative filtering is applied to the set of per-user views and clicks. This means that if a user clicks on similar links and looks through similar walls, it makes sense to show them a ‘see also’ block based on votes from other users with similar interests.
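As an illustration of the idea, here is a small user-based variant of cosine collaborative filtering in Python. It is a sketch under assumptions: the source does not say how views and clicks are weighted or whether similarity is computed between users or items, so the interaction counts, the item labels, and the function names `cosine` and `see_also` are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def see_also(target, interactions, top_n=3):
    """Rank items the target user has not seen by similarity-weighted votes."""
    seen = interactions[target]
    votes = defaultdict(float)
    for other, items in interactions.items():
        if other == target:
            continue
        sim = cosine(seen, items)
        if sim <= 0:
            continue  # users with nothing in common cast no votes
        for item, weight in items.items():
            if item not in seen:
                votes[item] += sim * weight
    # Highest vote first; ties broken by item name for stable output.
    return sorted(votes, key=lambda i: (-votes[i], i))[:top_n]


# Hypothetical per-user counts of views and clicks.
interactions = {
    "alice": {"link:a": 2, "wall:x": 1},
    "bob": {"link:a": 1, "wall:x": 1, "link:b": 3},
    "carol": {"link:c": 4},
}
print(see_also("alice", interactions))  # ['link:b']
```

Here only `bob` shares interests with `alice`, so his unseen item is the sole suggestion, while `carol` has zero similarity and contributes nothing.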
Once digital profiles are obtained, filters on the already gathered age groups, sex, and tags are added. This increases the relevance of the proposed blocks by a few more fractions of a percent.
Now the targeted campaign is ready to be launched. Control groups with ‘native’ traffic are added, and other groups are targeted through viral news and specific advertisements. Prepared news posts are used to track changes in the users’ behavioral models. This gives network owners a powerful tool to manage the advertising content.
Big Data implementation in social media
Implementing Big Data in social media, with heavy use of statistical gathering and analysis, gave the project the ability to pre-provision hardware in case new viral news arrives, and lets advertising campaigns run smoothly and reach only the target audience. A simple analysis of ‘content boosters’ makes the changes in users’ digital profiles visible, so the network’s resources are ultimately used more efficiently.