Today, opinion mining is one of the most important applications of natural language processing, and the high volume of comments produced requires special methods for processing documents. Since users' opinions on social networks and e-commerce websites constitute an evolving stream, applying traditional non-incremental classification algorithms to opinion mining degrades the classification model as time passes.
Moreover, because users' comments are massive in volume, it is not feasible to label enough comments to build training data for updating the learned model. Another issue in incremental opinion mining is concept drift, which must be handled to cope with changing class distributions and an evolving vocabulary.
In this paper, a new incremental method for polarity detection is proposed that uses stream-based active learning to select the best documents for expert labeling and to update the classifier. The proposed method can detect and handle concept drift using limited labeled data, without storing the documents. We compare our method with state-of-the-art incremental and non-incremental classification methods using credible datasets and standard evaluation measures. The evaluation results show the effectiveness of the proposed method for polarity detection of opinions.
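As a rough illustration of how stream-based active learning with uncertainty sampling can support this kind of polarity detection (a generic sketch, not the paper's exact method): documents close to the decision boundary are sent to the expert for labeling, the model is updated incrementally, and no document is stored. The threshold, feature hashing setup, and sample texts below are all assumptions.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Feature hashing keeps memory bounded and needs no stored vocabulary.
vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False)
clf = SGDClassifier(random_state=0)
classes = ["neg", "pos"]

# Seed the model with a handful of labeled documents (hypothetical data).
seed_docs = ["great product, works perfectly", "terrible service, very slow"]
clf.partial_fit(vectorizer.transform(seed_docs), ["pos", "neg"], classes=classes)

MARGIN = 0.5  # hypothetical uncertainty threshold for querying the expert
labeled = 0
stream = [
    ("awful experience, would not buy again", "neg"),
    ("absolutely great, highly recommended", "pos"),
    ("terrible packaging, item arrived broken", "neg"),
]
for doc, expert_label in stream:
    x = vectorizer.transform([doc])
    if abs(clf.decision_function(x)[0]) < MARGIN:
        # Classifier is uncertain: query the expert and update incrementally.
        clf.partial_fit(x, [expert_label])
        labeled += 1
    # The raw document is discarded either way; nothing is stored.
```

Only the uncertain fraction of the stream consumes the expert's labeling budget, which is what keeps the approach viable when comments are massive.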
The support vector machine is one of the most popular and efficient algorithms in machine learning. Several versions of this algorithm exist, the latest of which is the fuzzy least squares twin support vector machine. On the other hand, in many machine learning applications input data is generated continuously, which has made many traditional algorithms inefficient at handling it. In this paper, for the first time, an incremental version of the fuzzy least squares twin support vector machine is presented. The proposed algorithm is presented in both online and quasi-online modes. To evaluate the accuracy and precision of the proposed algorithm, we first run it on 6 datasets from the UCI repository. The results show that the proposed algorithm is more efficient than other algorithms (even non-incremental versions). In the second phase of the experiments, we consider an Internet of Things application, specifically data on daily activities, which is inherently incremental. According to the experimental results, the proposed algorithm has the best performance compared to other incremental algorithms.
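To illustrate the incremental flavor of least-squares learning, the sketch below updates a plain linear least-squares classifier one sample at a time via the Sherman-Morrison identity, so no past samples need to be revisited. It deliberately omits the twin (non-parallel hyperplane) structure and the fuzzy membership weights of the actual algorithm; the class and parameter names are illustrative.

```python
import numpy as np

class OnlineLSClassifier:
    """Linear least-squares classifier updated one sample at a time via
    recursive least squares (Sherman-Morrison). A simplified sketch: the
    paper's fuzzy least squares twin SVM instead fits two non-parallel
    hyperplanes with fuzzy membership weights, which is omitted here."""

    def __init__(self, dim, reg=1.0):
        self.P = np.eye(dim + 1) / reg  # inverse of regularized Gram matrix
        self.w = np.zeros(dim + 1)      # weight vector including bias term

    def partial_fit(self, x, y):
        """Update with one sample x and label y in {-1, +1}."""
        z = np.append(x, 1.0)           # augment input with bias feature
        Pz = self.P @ z
        k = Pz / (1.0 + z @ Pz)         # Sherman-Morrison gain vector
        self.w += k * (y - z @ self.w)  # correct weights by residual
        self.P -= np.outer(k, Pz)       # rank-one update of the inverse

    def predict(self, x):
        return 1 if np.append(x, 1.0) @ self.w >= 0 else -1

# Usage sketch on toy 2D data, fed strictly one sample at a time.
clf = OnlineLSClassifier(dim=2)
clf.partial_fit(np.array([1.0, 0.0]), 1)
clf.partial_fit(np.array([-1.0, 0.0]), -1)
```

Each update costs O(d^2) regardless of how many samples have been seen, which is what makes the least-squares formulation attractive for streams.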
Streaming data refers to data that is continuously generated in the form of fast, high-volume streams. This kind of data often arises in evolving environments, where a change may affect the data distribution. Because data streams have a wide range of real-world applications, improving the performance of streaming analytics has become a hot topic for researchers. The proposed method integrates online ensemble learning into the extreme learning machine to improve data stream classification performance. The proposed incremental method does not need to access the samples of previous blocks. Also, following the AdaBoost approach, it can react to concept drift through its component weighting and component update mechanisms. The proposed method can adapt to changes, and its performance is leveraged to retain highly accurate classifiers. The experiments were conducted on benchmark datasets. The proposed method achieves an average specificity of 0.90, an average sensitivity of 0.69, and an average accuracy of 0.87, indicating its superiority over two competing methods.
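The block-based ensemble idea can be sketched roughly as follows: each incoming block trains a new component, existing components are re-weighted by their accuracy on that block (so members fitted to an outdated concept fade after drift), and the weakest member is pruned. This is a minimal generic sketch, assuming a linear base learner in place of an extreme learning machine; the weighting and update rules here are illustrative, not the paper's exact mechanisms.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

class BlockEnsemble:
    """Illustrative block-based stream ensemble with AdaBoost-style
    component weighting. Base learner and weighting rule are simplified
    stand-ins, not the paper's ELM-based formulation."""

    def __init__(self, max_members=5):
        self.members = []              # list of {"model", "weight"} dicts
        self.max_members = max_members

    def process_block(self, X, y):
        # Drift reaction: re-weight members by accuracy on the new block.
        for m in self.members:
            m["weight"] = float(np.mean(m["model"].predict(X) == y)) + 1e-6
        # Train a new component on the current block only; previous
        # blocks' samples are never accessed again.
        model = SGDClassifier(random_state=0).fit(X, y)
        self.members.append({"model": model, "weight": 1.0})
        if len(self.members) > self.max_members:
            # Prune the lowest-weighted (least accurate) component.
            self.members.remove(min(self.members, key=lambda m: m["weight"]))

    def predict(self, X):
        # Weighted majority vote across all components.
        scores = [dict() for _ in range(len(X))]
        for m in self.members:
            for i, p in enumerate(m["model"].predict(X)):
                scores[i][p] = scores[i].get(p, 0.0) + m["weight"]
        return np.array([max(s, key=s.get) for s in scores])

# Usage sketch: two blocks of toy, linearly separable 2D data.
ens = BlockEnsemble(max_members=3)
X = np.array([[-2.0, 0.0], [-2.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([0, 0, 1, 1])
ens.process_block(X, y)
ens.process_block(X, y)
```

Because under-performing members lose vote weight immediately on the next block, the ensemble can shift toward the new concept without storing or replaying old samples.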