MEASURING INFLUENCE ON INSTAGRAM: A NETWORK-OBLIVIOUS APPROACH

Fig. 1. Distributions per Instagrammer: (a) Views Histogram, (b) Views per Followers, (c) Views per Likes.

• focus - The difference and ratio between the most and least engaged posts. These features were designed to test the variance and stability of a user's engagement level.

5 REGRESSION MODELS

We attempt to measure influence using well-known regression models over the features described in Section 4.2. Furthermore, as some models are sensitive to redundant features, we perform recursive feature elimination, generating a subset of informative features for the problem at hand. The models tested include:

• Ridge Regression (RR) - An extension of Linear Regression, RR attempts to overcome Linear Regression's problem with feature multi-collinearity by adding l2-norm regularization of the coefficients to the minimization problem [35].
• Random Forest (RF) - A non-linear algorithm that relies on an ensemble of decision trees, with randomness injected into the model in both feature and instance selection [36].

We also introduce a meta-algorithm expansion of our own. It is clear that not all influencers should be handled in the same manner, and celebrities' statistics would differ vastly from those of micro-influencers. We propose a Multiple-Regression model, where the data is separated into subsets, in our case using the K-Means clustering algorithm on the followers statistic [37], and a regression model is built for each subset.

Finally, it can be seen in Figures 1b and 1c that the likes and followers statistics grow in an exponential manner. To handle potential bias towards these features, both in clustering and regression, we transform these statistics using a log scale, i.e., f(x) = x / ln x.

6 EXPERIMENTAL RESULTS

In this section, we present the methodology for evaluating different techniques and introduce two simple yet commonly used baselines. We test our models and present the results of our attempt to measure the influence of Instagram users.

6.1 Methodology

To compare different models, we employ two commonly used statistics. To test a model's ability to measure influence, we employ the coefficient of determination, denoted R². Bounded above by 1, a higher R² score indicates lower error variance and therefore a tighter model. Comparing the order of the predicted influence with the real influence allows us to rank users. To test the resulting ranking, we use Spearman's rank correlation coefficient, denoted r_s.

To avoid the problems of a model tuned specifically to the test data, we use five-fold cross-validation. We randomly split U into five equally sized, disjoint sets of Instagrammers and use them as five train-test datasets: each test set contains roughly 20% of the original set of users U, and the train set is made up of the remaining 80%. The results are averaged over the five test cases.

6.2 Baselines

Two natural baselines for measuring influence are the user's audience size (followers) and engagement level (number of likes). We use both statistics as baselines, utilizing a Linear Regression model. While outside our scope, for completeness we also used the PageRank extension suggested by Egger [33]. For this, we crawled Instagram, creating a commentators graph around our test users.

6.3 Comparison of Techniques

The R² and r_s statistics for the regression models and baselines are provided in Table 1. These results include both clustered and unclustered attempts, as well as the results of the feature-reduced models.

It is clear that the followers statistic, while intuitive and often used in real-world scenarios, is the weakest on any given metric. This agrees with previous findings by Cha et al. [24]. The engagement baseline is the best choice for a direct ranking approach, as it is almost the best, certainly within error range, and is much simpler to use than the full regression models.

Amongst our suggested models, Multi-Regression was not a useful approach, while feature reduction still resulted in strong models with only half the features. When comparing RR and RF, we clearly see that RR is a more accurate model. This is due to a limitation of the RF model - while RR can return any possible value, RF models can return only
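
The evaluation protocol of Section 6.1 can be illustrated with a minimal sketch. It assumes plain Python lists of true and predicted influence values and computes the coefficient of determination (R²), Spearman's rank correlation (r_s), and a random five-fold split of users. The function names, the fixed seed, and the average-rank handling of ties are assumptions made for illustration; the paper does not specify its implementation.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (at most 1)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(y_true, y_pred):
    """Spearman's r_s: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(y_true), average_ranks(y_pred))


def five_fold_splits(users, seed=0):
    """Yield (train, test) pairs over five disjoint, near-equal folds."""
    shuffled = list(users)
    random.Random(seed).shuffle(shuffled)  # seed is an illustrative choice
    folds = [shuffled[i::5] for i in range(5)]
    for i, test in enumerate(folds):
        train = [u for j, fold in enumerate(folds) if j != i for u in fold]
        yield train, test
```

Under this sketch, a model would be fitted on each train set, scored with r_squared and spearman on the matching test set, and the five scores averaged.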
