Lecture Notes in Social Networks

Applications of Data Management and Analysis
Case Studies in Social Networks and Beyond

Mohammad Moshirpour, Behrouz H. Far, Reda Alhajj (Editors)

Series editors
Reda Alhajj, University of Calgary, Calgary, AB, Canada
Uwe Glässer, Simon Fraser University, Burnaby, BC, Canada
Huan Liu, Arizona State University, Tempe, AZ, USA
Rafael Wittek, University of Groningen, Groningen, The Netherlands
Daniel Zeng, University of Arizona, Tucson, AZ, USA

Advisory Board
Charu C. Aggarwal, Yorktown Heights, NY, USA
Patricia L. Brantingham, Simon Fraser University, Burnaby, BC, Canada
Thilo Gross, University of Bristol, Bristol, UK
Jiawei Han, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Raúl Manásevich, University of Chile, Santiago, Chile
Anthony J. Masys, University of Leicester, Ottawa, ON, Canada
Carlo Morselli, School of Criminology, Montreal, QC, Canada

More information about this series at http://www.springer.com/series/8768

Editors
Mohammad Moshirpour, Department Electrical & Computer Engineering, University of Calgary, Calgary, AB, Canada
Behrouz H. Far, Department Electrical & Computer Engineering, University of Calgary, Calgary, AB, Canada
Reda Alhajj, Department of Computer Science, University of Calgary, Calgary, AB, Canada

ISSN 2190-5428    ISSN 2190-5436 (electronic)
Lecture Notes in Social Networks
ISBN 978-3-319-95809-5    ISBN 978-3-319-95810-1 (eBook)
https://doi.org/10.1007/978-3-319-95810-1
Library of Congress Control Number: 2018954658

© Springer Nature Switzerland AG 2018
This work is subject to copyright.
All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

The quantity of data in various science and engineering domains is increasing at a phenomenal rate, in structured and semi-structured formats. The data is characterized by its complexity, volume, high dimensionality, and velocity. Together with this growth, a large number of data science tools and techniques are available for analyzing data and extracting useful, actionable, and reusable knowledge. Data science is an interdisciplinary field.
Theories, techniques, and tools are drawn from various fields within mathematics, statistics, information science, software engineering, signal processing, probability models, machine learning, statistical learning, data mining, database systems, data engineering, pattern recognition, visualization, predictive analytics, uncertainty modeling, data warehousing, data compression, artificial intelligence, and high performance computing. Practitioners in each domain need to know the characteristics of their data sets, select appropriate data science tools and techniques, and fit them to their own problem.

The goal of this volume is to provide practical examples of data science tools and techniques fitted to solve certain science and engineering problems. In this volume, the contributing authors provide examples and solutions in various areas of engineering, business, medicine, bioinformatics, geomatics, and environmental science. This will help professionals and practitioners to understand the benefits of data science in their domain and to understand where a particular theory, technique, or tool would be applicable and useful.

Mohammad Moshirpour, Calgary, AB, Canada
Behrouz H. Far, Calgary, AB, Canada
Reda Alhajj, Calgary, AB, Canada

Contents

Predicting Implicit Negative Relations in Online Social Networks
    Animesh Gupta, Reda Alhajj, and Jon Rokne
Automobile Insurance Fraud Detection Using Social Network Analysis
    Arezo Bodaghi and Babak Teimourpour
Improving Circular Layout Algorithm for Social Network Visualization Using Genetic Algorithm
    Babak Teimourpour and Bahram Asgharpour
Live Twitter Sentiment Analysis
    Dayne Sorvisto, Patrick Cloutier, Kevin Magnusson, Taufik Al-Sarraj, Kostya Dyskin, and Giri Berenstein
Artificial Neural Network Modeling and Forecasting of Oil Reservoir Performance
    Ehsan Amirian, Eugene Fedutenko, Chaodong Yang, Zhangxin Chen, and Long Nghiem
A Sliding-Window Algorithm Implementation in MapReduce
    Emad A. Mohammed, Christopher T. Naugler, and Behrouz H. Far
A Fuzzy Dynamic Model for Customer Churn Prediction in Retail Banking Industry
    Fatemeh Safinejad, Elham Akhond Zadeh Noughabi, and Behrouz H. Far
Temporal Dependency Between Evolution of Features and Dynamic Social Networks
    Kashfia Sailunaz, Jon Rokne, and Reda Alhajj
Recommender System for Product Avoidance
    Manmeet Dhaliwal, Jon Rokne, and Reda Alhajj
A New 3D Value Model for Customer Segmentation: Complex Network Approach
    Mohammad Saeedi and Amir Albadvi
Finding Influential Factors for Different Types of Cancer: A Data Mining Approach
    Munima Jahan, Elham Akhond Zadeh Noughabi, Behrouz H. Far, and Reda Alhajj
Enhanced Load Balancer with Multilayer Processing Architecture for Heavy Load Over Cloud Network
    Navdeep Singh Randhawa, Mandeep Dhami, and Parminder Singh
Market Basket Analysis Using Community Detection Approach: A Real Case
    Sepideh Faridizadeh, Neda Abdolvand, and Saeedeh Rajaee Harandi
Predicting Future with Social Media Based on Sentiment and Quantitative Analysis
    Sahil Sharma, Jon Rokne, and Reda Alhajj
Index

Predicting Implicit Negative Relations in Online Social Networks
Animesh Gupta, Reda Alhajj, and Jon Rokne

A. Gupta (✉) · R. Alhajj · J. Rokne
Department of Computer Science, University of Calgary, Calgary, AB, Canada
e-mail: animesh.gupta@ucalgary.ca; alhajj@cpsc.ucalgary.ca; rokne@ucalgary.ca
© Springer Nature Switzerland AG 2018
M. Moshirpour et al. (eds.), Applications of Data Management and Analysis, Lecture Notes in Social Networks, https://doi.org/10.1007/978-3-319-95810-1_1

Introduction

Social network analysis provides a plethora of information on relationships between the users of that social network. A social network contains both positive and negative links. The link prediction problem [1, 2] can be used to study the latent relationship between people because it is always hard to find out explicitly what other people think [3]. Although positive link prediction is quite common in social network analysis, research on negative link prediction is scarce. One reason is the lack of datasets available for negative link prediction: many social networking firms such as Facebook, Twitter, and LinkedIn consider it pointless to collect negative link information and therefore do not even allow users to show dislike toward a post or a comment. A few websites, however, allow users to express dislike toward other users. Two such websites are Epinions and Slashdot. We therefore use the datasets from these websites to carry out the prediction of negative links in social networks.

Slashdot is a technology news website. It features news stories on science and technology that are submitted and evaluated by site users [4], and it allows users to express both positive and negative links toward other users [5]. The site makes use of a user-based moderation system in which the moderators are selected randomly. Moderation applies either −1 or +1 to the current rating, based on whether the comment is perceived as either “normal,” “offtopic,” “insightful,” “redundant,” “interesting,” or “troll” (among others) [4]. The dataset chosen for this work is a signed dataset from February 21, 2009.

Epinions is a product review website established in 1999 where users can express approval or discontent toward the reviews posted by other users. This helps them to decide whether to buy a product by reading other people’s reviews of the product they are interested in. The members can also choose to either trust or distrust other members of the website [6–8]. All the trust relationships interact and form the Web of Trust, which is then combined with review ratings to determine which reviews are shown to the user [9, 10].

Here is the summary of the Epinions and Slashdot datasets (Fig. 1).

                  Epinions    Slashdot
Nodes             131,828     82,144
Edges             841,372     549,202
Positive edges    85%         77.4%
Negative edges    15%         22.6%

Fig. 1 Summary of the dataset

We have used the Slashdot and Epinions datasets to create a logistic regression classifier based on the given information about the already existing links (positive or negative) between users. The objective of this work is to predict the links between those users who do not have any link between them but may have positive or negative links with other users of the network (Fig. 2).

Fig. 2 Missing links in a social network

Related Work

In [11], the authors predict both positive and negative links in online social networks by using a set of 23 feature sets. The first class of feature sets is based on the degree of a node. In the second class of feature sets, each triad involving an edge (u,v) is considered. For each pair (u,v), there also exists a “w” which completes the triad and either has an edge coming from or going into u and similarly has an edge coming from or going into v.
Since the edge can be in either direction and can be a positive or a negative edge, the total number of possible combinations is 16, and these combinations translate into 16 different feature sets. It is believed that each of these 16 triad sets can provide different evidence about the sign of the edge between u and v. The datasets used for this work are Epinions, Slashdot, and Wikipedia. A balanced dataset is then created to match the number of negative links to the number of positive links, because positive links usually outnumber negative links. A logistic regression classifier is used to combine the evidence from these individual features into an edge sign prediction. The predictive accuracy of detecting a link is about 85% when the degree feature set and the triad feature set are considered independently, and it rises to about 90% when both feature sets are considered together.

In [12], only positive links and content centric interactions are used to predict a negative link in a social network. This is a novel technique that takes advantage of the abundance of positive links in a social network. The authors propose an algorithm, NeLP (negative link prediction), which can exploit the positive links and content centric interactions to predict negative links [12]. This approach is based on the idea that although explicit information is not available in most cases for social networks, a combination of positive links and content centric interaction can help to detect implicit negative links. Tang et al. also use the Epinions and Slashdot datasets for NeLP. The F1-measure achieved is 0.32 and 0.31 when the training is done on the Epinions and Slashdot datasets, respectively. The precision achieved by the negative link prediction algorithm is 0.2861 for the Epinions dataset and 0.2139 for the Slashdot dataset. Compared to other baseline approaches, this framework achieves an impressive performance improvement.

Positive, implicit, and negative information of all users in a network is used by Min-Hee Jang et al. [13] and Cheng et al.
[14]. Based on belief propagation, the trust relationship of a user is determined. They use a metric called belief score and calculate it for every user in the network based on that user’s interaction with his neighbors. Every user, which is a node in the system, is assigned one of two values: trustable or distrustful. This method of trust prediction reports a higher accuracy by up to 10.1% and 20.6% compared to ITD and ABIT_L (similar trust prediction algorithms), respectively.

Trustworthiness of users based on local trust metrics is studied in [10] and used to determine whether a controversial person is trustworthy or not. The same problem is also tackled in [15] using the trust antecedent framework. Also, Yang et al. [16] show that it is possible to infer signed social ties with good accuracy solely based on users’ decision-making behavior (or using only a small fraction of supervision information) via unsupervised and semi-supervised algorithms. A survey of link prediction in complex networks is given in [17].

Methodology

Dataset Description

The Slashdot signed social network dataset from February 21, 2009, has 82,144 nodes and 549,202 edges. Since this is a signed dataset, it contains information about the nature of the relationship between two users. A positive relationship is denoted by “+1” and a negative relationship is denoted by “−1.” About 77.4% of the edges in this dataset are positive edges and only 22.6% are negative edges. Semantic Web applications [18] and Web spam detection are used by Slashdot to control deviant behavior and develop user rating mechanisms.

Similarly, the signed dataset of Epinions is used as well. Since Epinions is a product review website, “+1” indicates a helpful review about a product by another user, and “−1” indicates a non-trustable review by a user (Fig. 3).

Fig. 3 Distribution of signed edges

Formulation in R

Loading the Data

The datasets are taken from the Stanford Network Analysis Project [4]. The extracted text file is loaded into a data matrix in R. Within the data, negative link information between two nodes is denoted by −1, and positive link information is denoted by +1. Since we plan to apply logistic regression, the classifier takes as input only values which are either 0 or 1. To make the values consistent with the input/output of the classifier, all the values in the data which are −1 are converted to 0. All the negative links between the nodes in our dataset are now represented by 0, and the positive links are represented by +1. We do this for both datasets and create two data matrices which are transformed in the subsequent steps to load the features and train the model.

Transforming the Data by Loading Features

It is important to choose the right set of features to base our machine learning model on. Common neighbors can reveal a lot of information about a user of a social network. Information about a user’s number of friends and enemies can be used to predict his/her relationship with other users in the network. A user who is listed as a foe by a majority of the other users has a high probability of being disliked by a random user in the network with whom he may not have had any earlier relation. Based on the assumption that relationships with neighbors are a good basis for selecting the feature set for the classifier, we choose the following four feature sets to be used for the model:
• udoutpos: Number of outgoing positive edges from “u”
• udoutneg: Number of outgoing negative edges from “u”
• vdinpos: Number of incoming positive edges into “v”
• vdoutneg: Number of incoming negative edges into “v”
We want to predict the relationship between the pair of users (u,v).
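
The four degree counts listed above can be computed in one pass over the signed edge list. The sketch below is written in Python rather than R, purely for illustration; the names `recode_sign`, `degree_features`, and the toy edge list are assumptions of this example and do not come from the chapter. It also assumes that every edge, including the pair's own edge, contributes to the counts, which the text does not specify.

```python
from collections import Counter

def recode_sign(sign):
    """Map the raw edge sign (-1 or +1) to the 0/1 coding used by the classifier."""
    return 0 if sign == -1 else 1

def degree_features(raw_edges):
    """Return {(u, v): (u_out_pos, u_out_neg, v_in_pos, v_in_neg)} for every edge.

    raw_edges is a list of (u, v, sign) tuples with sign in {-1, +1}.
    """
    edges = [(u, v, recode_sign(s)) for u, v, s in raw_edges]
    out_pos, out_neg = Counter(), Counter()
    in_pos, in_neg = Counter(), Counter()
    for u, v, sign in edges:
        if sign == 1:
            out_pos[u] += 1   # positive edge leaving u
            in_pos[v] += 1    # positive edge entering v
        else:
            out_neg[u] += 1   # negative edge leaving u
            in_neg[v] += 1    # negative edge entering v
    return {
        (u, v): (out_pos[u], out_neg[u], in_pos[v], in_neg[v])
        for u, v, _ in edges
    }

# Hypothetical toy network: (from, to, sign)
toy_edges = [("a", "b", 1), ("a", "c", -1), ("b", "c", 1), ("d", "c", -1), ("a", "d", 1)]
features = degree_features(toy_edges)
print(features[("a", "c")])  # (2, 1, 1, 2)
```

For the pair (a, c), user a has two outgoing positive edges and one outgoing negative edge, while c receives one positive and two negative edges, which gives the tuple shown.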
Since we have individual information about the relationship of u and v with their respective neighbors, we use that to create a list of feature sets. This is implemented by taking all the pairs of users in the dataset and counting the total number of +1’s or 0’s for each pair. A new column is then added to the data matrix, which is populated with the total number of positive or negative links for every pair.

Logistic Regression

The dependent variable is dichotomous in logistic regression, and there are one or more independent variables that determine an outcome. The dependent variable in our case is the edge sign between u and v. Since the edge sign can be either 0, indicating a negative relationship, or +1, indicating a positive relationship, we use the logistic model.

The logistic function is:

\[
P(+ \mid x) = \frac{1}{1 + e^{-\left(b_0 + \sum_{i=1}^{n} b_i x_i\right)}}
\]

where the feature vector is denoted by x and b_0, b_1, ..., b_n are the weights or coefficients which are determined by the classifier based on the training dataset. Also, here is a graph of a standard logistic function (Fig. 4).

Fig. 4 Standard logistic function

Splitting the Data into Training and Testing Data

We follow a 90:10 split to divide the dataset into training and testing datasets. The first 90% of the data records are used for training the classifier, and the remaining 10% are used for testing our model. Under the 90:10 split, 757,233 data entries from Epinions are used for training the model, and the remaining 84,137 are used to test our model. Similarly, 494,280 data entries from Slashdot are used for training the model, and 54,920 data entries are used for testing.

Fitting a Logistic Regression Model Using Training Data

To fit a linear model to predict a categorical outcome, we make use of generalized linear models (GLM).
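
Before turning to the fit itself, the logistic function and the ordered 90:10 split described above can be sketched as follows. This is an illustrative Python sketch, not the authors' R code; the names `logistic` and `ordered_split`, the `train_percent` parameter, and the toy records are assumptions of this example.

```python
import math

def logistic(x, b0, b):
    """P(+ | x) = 1 / (1 + exp(-(b0 + sum_i b_i * x_i)))."""
    z = b0 + sum(bi * xi for bi, xi in zip(b, x))
    return 1.0 / (1.0 + math.exp(-z))

def ordered_split(records, train_percent=90):
    """Use the first train_percent% of the records for training, the rest for testing."""
    cut = len(records) * train_percent // 100
    return records[:cut], records[cut:]

# Hypothetical toy data: 20 records standing in for rows of the data matrix
records = list(range(20))
train, test = ordered_split(records)

# With a zero linear predictor the logistic function returns exactly 0.5
p_neutral = logistic([0.0, 0.0], 0.0, [1.0, -1.0])
print(len(train), len(test), p_neutral)
```

Because the split takes the first 90% of the rows in file order rather than a random sample, the result depends on how the edge list is ordered; shuffling before splitting would be a different design choice from the one the text describes.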
GLMs are an extension of linear regression models in which the dependent variable is allowed to be non-normal. The function glm (generalized linear model) in R is used to fit the logistic regression to our training dataset. glm fits generalized linear models, specified by giving a symbolic description of the linear predictor and a description of the error distribution. The feature sets defined earlier are passed as the first set of parameters. The training data and the glm family information are passed as the second and the third parameters, respectively. The family is specified as binomial because the outcome we are trying to predict is binary and can take only two values, 0 for a negative relationship and +1 for a positive relationship between users.

Using the Fitted Model to Do Predictions for the Test Data

The results returned by glm are stored in a variable called “linkpredictor.” This is then used to predict the signs of the unknown links between nodes. The function “predict” in R is used to obtain predicted values based on linear model objects. We pass three parameters, respectively, to this function as follows:
• Linkpredictor
• Testing data
• Type
The type of prediction is chosen as response.

Results and Discussion

The logistic regression classifier converges successfully under the given feature set for the training data. The estimates, standard errors, and z-statistics for features 1 and 2 and features 3 and 4 are as follows:

                   Estimate     Standard error   Z-value   Pr(>|z|)
Intercept          −2.469888    0.378926         −6.518    7.12 × 10^−11
Feature 1 and 2    0.017103     0.001817         9.410     <2 × 10^−16
Feature 3 and 4    2.286318     0.328871         6.952     3.60 × 10^−12

Also, the number of Fisher scoring iterations is 15. Thus, the process ran for a total of 15 iterations before outputting the results.
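
The coefficient table can be read with a little arithmetic: each z-value is the estimate divided by its standard error, and a prediction of type response is the logistic transform of the linear predictor. The Python sketch below uses the reported numbers; the function name `predict_response` and the input values passed to it are hypothetical, since the text does not say how the four features were combined into the two reported rows.

```python
import math

# (estimate, standard error) as reported in the coefficient table
coefficients = {
    "Intercept": (-2.469888, 0.378926),
    "Feature 1 and 2": (0.017103, 0.001817),
    "Feature 3 and 4": (2.286318, 0.328871),
}

# z-value = estimate / standard error
z_values = {name: est / se for name, (est, se) in coefficients.items()}

def predict_response(x12, x34):
    """Probability of a positive link on the response scale (assumed inputs)."""
    link = (
        coefficients["Intercept"][0]
        + coefficients["Feature 1 and 2"][0] * x12
        + coefficients["Feature 3 and 4"][0] * x34
    )
    return 1.0 / (1.0 + math.exp(-link))

for name, z in z_values.items():
    print(f"{name}: z = {z:.3f}")

# Hypothetical feature values, chosen only to show the mechanics
p = predict_response(0.0, 1.0)
predicted_sign = 1 if p >= 0.5 else 0  # the 0.5 cutoff is an assumption of this sketch
```

The recomputed ratios agree with the printed z-values up to rounding of the displayed estimates and standard errors.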
Although the model predicts the correct link between two unknown nodes about 90% of the time, using more feature sets based on factors other than common neighbors, and taking into consideration psychological theories such as status and balance theory, could improve the statistics significantly.

Figure 5 is a plot created in R showing the correlation between the features used for regression and the to and from node information along with the corresponding sign of the link. As can be seen here, there is a strong correlation between the selected features and the dataset. Therefore, we can say that the choice of feature sets is a good one, as they help the regression model to converge quickly, as is observed in the Fisher scoring iterations as well.

Fig. 5 Correlation between the entries of the data matrix

Future Work

Major social-psychological theories such as the status and the balance theory can be useful for detecting a link in a social network with increased accuracy [19].

It is worthwhile to distinguish a friend from an enemy in a social network of users [20]. The balance theory is widely studied and is based on a few common social facts such as “the friend of my friend is my friend,” “the enemy of my friend is my enemy,” “the friend of my enemy is my enemy,” and “the enemy of my enemy is my friend.” An extension to the balance theory was proposed by Chiang et al. in [21], in which they exploit both the local and global aspects of balance theory for sign prediction and clustering by studying the local patterns in social networks. The balance theory can be incorporated in our model by assuming that for a triad between a user x and a pair (u,v), we can assign a signed edge to the pair (u,v) based on the kind of relationship x has with both u and v.

The status theory was used in the research by Guha et al. [9] and Leskovec et al. [11] and, in both cases, has helped the authors to achieve better results for edge sign prediction.
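
As a rough illustration of the balance-theory rule described above, the four quoted social facts reduce to multiplying the signs of the two ties that a common neighbor x has with u and v. The Python sketch below treats ties as undirected and combines several common neighbors by a simple majority vote; both choices, along with the names `balance_sign`, `predict_by_balance`, and the toy ties, are assumptions of this example and not part of the chapter's model.

```python
def balance_sign(sign_xu, sign_xv):
    """Sign suggested for (u, v) given x's ties to u and v, each +1 or -1.

    friend of my friend -> +1, enemy of my friend -> -1,
    friend of my enemy  -> -1, enemy of my enemy  -> +1.
    """
    return sign_xu * sign_xv

def predict_by_balance(signed_ties, u, v):
    """Majority vote over all common neighbors; returns +1, -1, or 0 for no evidence or a tie.

    signed_ties maps frozenset({a, b}) to +1 or -1 (undirected ties).
    """
    nodes = set().union(*signed_ties)
    votes = 0
    for x in nodes - {u, v}:
        s_xu = signed_ties.get(frozenset((x, u)))
        s_xv = signed_ties.get(frozenset((x, v)))
        if s_xu is not None and s_xv is not None:
            votes += balance_sign(s_xu, s_xv)
    return (votes > 0) - (votes < 0)

# Hypothetical toy ties: x1 likes both, x2 dislikes both, x3 likes u but dislikes v
ties = {
    frozenset(("x1", "u")): 1, frozenset(("x1", "v")): 1,
    frozenset(("x2", "u")): -1, frozenset(("x2", "v")): -1,
    frozenset(("x3", "u")): 1, frozenset(("x3", "v")): -1,
}
predicted = predict_by_balance(ties, "u", "v")
print(predicted)  # 1
```

Two of the three common neighbors suggest a positive tie and one suggests a negative tie, so the vote comes out positive.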
A pair of users (u,v) is said to have a positive edge between them if u has a higher status than v, whereas a negative edge exists between u and v if v has a higher status than u. Status theory predicts that when the direction of an edge is flipped, its sign should flip as well [11].

Status and balance theories may agree or disagree on the sign of an edge between a given pair of users. For example, for a given pair of users (x,y), if x has a positive link with w and w has a positive link with y, then both theories predict a positive link between x and y. But if y has a positive link with w and w has a positive link with x, balance theory will predict a positive link between x and y, whereas status theory will point toward a negative link between x and y because of the assumption that y has a higher status than x in this case.

We also plan to add a few more feature sets in addition to the four feature sets used for this work. These feature sets can be based on the relationships between distant neighbors at path length two or more, making use of the fact that similarity between two individuals’ movements strongly correlates with their proximity in the social network [22]. This would result in a better trained model because we would be moving away from considering only the effect of neighbors on a user.

Conclusion

Negative link prediction complements positive link prediction by providing more insight into social network analysis, for example by enabling better recommender systems. Edge sign prediction has mainly been done to predict positive links between the users of a social network, whereas this work focuses on predicting negative links as well. However, negative link prediction is challenging as many social networking websites do not gather negative link information.
For this work, we exploited the closest neighbors of a pair of nodes to create a set of features and applied a logistic regression model to it.

We also saw that there is a strong correlation between the feature set selected to base our model on and the Epinions and Slashdot datasets, which is a strong indication that the feature selection is indeed good.

References

1. Liben-Nowell, D., & Kleinberg, J. (2007). The link prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7), 1019–1031.
2. Chiang, K.-Y., Natarajan, N., Tewari, A., & Dhillon, I. S. (2011). Exploiting longer cycles for link prediction in signed networks. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management (pp. 1157–1162). New York, NY: ACM.
3. Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1–135.
4. Retrieved from https://snap.stanford.edu/data/soc-Epinions1.html
5. Lampe, C., Johnston, E., & Resnick, P. (2007). Follow the reader: Filtering comments on Slashdot. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1253–1262). San Jose, CA, USA, April 28–May 03, 2007.
6. Kunegis, J., Lommatzsch, A., & Bauckhage, C. (2009). The Slashdot Zoo: Mining a social network with negative edges. In Proceedings of the 18th World Wide Web (pp. 741–750). New York, NY: ACM.
7. Ma, H., Lyu, M. R., & King, I. (2009). Learning to recommend with trust and distrust relationships. In Proceedings of the third ACM Conference on Recommender Systems (pp. 189–196). New York, NY: ACM.
8. Kunegis, J., Preusse, J., & Schwagereit, F. (2013). What is the added value of negative links in online social networks? In Proceedings of the 22nd International Conference on World Wide Web (pp. 727–736). International World Wide Web Conferences Steering Committee.
9. Guha, R. V., Kumar, R., Raghavan, P., & Tomkins, A. (2004). Propagation of trust and distrust. In Proceedings of the 13th International Conference on World Wide Web (pp. 403–412). New York, NY: ACM Press.
10. Massa, P., & Avesani, P. (2005). Controversial users demand local trust metrics: An experimental study on epinions.com community. In AAAI 2005 (pp. 121–126). AAAI Press.
11. Leskovec, J., Huttenlocher, D., & Kleinberg, J. (2010). Predicting positive and negative links in online social networks.
12. Tang, J., Chang, S., Aggarwal, C., & Liu, H. (2015). Negative link prediction in social media.
13. Jang, M.-H., Faloutsos, C., & Kim, S.-W. (2014). Trust prediction using positive, implicit, and negative information.
14. Ye, J., Cheng, H., Zhu, Z., & Chen, M. (2013). Predicting positive and negative links in signed social networks by transfer learning. In Proceedings of the 22nd International Conference on World Wide Web (pp. 1477–1488). Rio de Janeiro, Brazil, May 13–17.
15. Nguyen, V., Lim, E., Jiang, J., & Sun, A. (2009). To trust or not to trust? Predicting online trusts using trust antecedent framework. In Proceedings of the ICDM 2009 (pp. 896–901).
16. Yang, S.-H., Smola, A. J., Long, B., Zha, H., & Chang, Y. (2012). Friend or frenemy? Predicting signed ties in social networks. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 555–564). Portland, Oregon, USA, August 12–16.
17. Lü, L., & Zhou, T. (2011). Link prediction in complex networks: A survey. Physica A, 390(6), 1150–1170.
18. Richardson, M., Agrawal, R., & Domingos, P. (2003). Trust management for the semantic web. In Proceedings of the Second International Conference on Semantic Web Conference (pp. 351–368). Sanibel Island, FL, October 20–23.
19. Leskovec, J., Huttenlocher, D., & Kleinberg, J. (2010). Signed networks in social media. In Proceedings of the 28th CHI. New York, NY: ACM.
20. Brzozowski, M. J., Hogg, T., & Szabó, G. (2008). Friends and foes: Ideological social networking. In Proceedings of the 26th CHI. New York, NY: Association for Computing Machinery.
21. Chiang, K.-Y., Hsieh, C.-J., Natarajan, N., Tewari, A., & Dhillon, I. S. Prediction and clustering in signed networks: A local to global perspective. arXiv preprint arXiv:1302.5145, 201.
22. Wang, D., Pedreschi, D., Song, C., Giannotti, F., & Barabási, A. Human mobility, social ties, and link prediction. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1100–1108). San Diego, CA, USA, August 21–24.


Automobile Insurance Fraud Detection Using Social Network Analysis
Arezo Bodaghi and Babak Teimourpour

Introduction

This paper proposes an automated system for detecting groups of perpetrators in automobile insurance. The system employs network analysis to identify suspicious behaviors in automobile collisions. There are two types of insurance fraud: opportunistic and professional. Opportunistic fraud is usually committed by a person who simply has an opportunity to inflate a claim or to obtain an overstated estimate of losses or repairs from his/her insurance company, while professional fraud is often committed by organized groups with multiple false identities, targeting multiple organizations or brands. These crime rings often work through insiders who help them defraud the company in several ways at once. Although the amount per incident is far greater, the incidence of professional fraud is lower than that of ordinary insurance fraud [1]. Fighting insurance fraud is a challenging problem. Most traditional systems are able to find opportunistic frauds, but insurance companies are highly interested in detecting organized groups because these cause the greatest financial losses. As a result, insurance companies need to apply modern technologies and intelligent systems to cope with this problem. Presentation of the paper is structured as follows.
In Section “Literature Review,” the literature related to our research is presented. Section “Research Methodology” introduces a new method for detection of fraudulent groups. Next, in section “Evaluation with the Prototype System,” the proposed system is evaluated on real-world data. Concluding remarks are given in section “Summary and Conclusion.”

A. Bodaghi · B. Teimourpour (✉)
Department of IT Engineering, School of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran
e-mail: arezo.bodaghi@modares.ac.ir; b.teimourpour@modares.ac.ir
© Springer Nature Switzerland AG 2018
M. Moshirpour et al. (eds.), Applications of Data Management and Analysis, Lecture Notes in Social Networks, https://doi.org/10.1007/978-3-319-95810-1_2

Literature Review

Insurance fraud has received serious attention in recent years. Although the issue arises mostly in practical and operational settings, it has also been studied academically because of its negative effect on insurance pricing and on the efficiency of the insurance industry.

Most existing systems apply anomaly detection to identify suspicious groups (e.g., [2]). The literature on organized fraud detection is extremely sparse. To the best of our knowledge, it involves the PRIDIT analysis proposed by [3], which is based on RIDIT scores and principal component analysis and has been applied by [4] for detection of suspicious components. Afterward, the system determines the suspicious entities in all detected suspicious components through the Iterative Assessment Algorithm (IAA). [2] presented a new unsupervised ranking method for detection of anomalies that takes into consideration both rare class ranking, in which an anomaly is assessed with respect to a single majority class, and anomaly ranking, which is assessed with respect to more than one major pattern. [5] provided a method for modeling and analysis of the directed weighted accident causation network (DWACN).
According to their paper, complex network analysis theory is effective for understanding and analyzing the causes of accidents in complex systems. They introduced a new method for creating the directed weighted accident causation network (DWACN) for studied branches and investigated the accident c...
