Spoken Term Detection (STD) approaches can be divided into two main groups: Hidden Markov Model (HMM)-based approaches and Discriminative STD (DSTD) approaches. An important advantage of HMM-based methods is that they can exploit context-dependent (diphone or triphone) information to improve the performance of the whole STD system. Conversely, the lack of triphone information is one of the significant drawbacks of DSTD methods. In this paper, we propose a solution to overcome this drawback of DSTD systems. To this end, we modify the feature-extraction stage of an Evolutionary DSTD (EDSTD) system so that it takes triphone information into account. We first propose a monophone-based feature-extraction stage for the EDSTD system, and then propose an approach for exploiting triphone information in it. The results on the TIMIT database indicate that the true detection rate of the triphone-based EDSTD (Tph-EDSTD) system, at more than two false alarms per keyword per hour, is about 3% higher than that of the monophone-based EDSTD (Mph-EDSTD) system. This improvement costs about a 36% reduction in system response speed, which is negligible.
One of the challenges of high-dimensional outlier detection is the curse of dimensionality, in which irrelevant dimensions (features) lead to hidden outliers. To address this problem, dimensions that carry valuable information for detecting outliers are sought, so that outliers become more prominent and detectable when the dataset is mapped into the subspace formed by these relevant dimensions/features. This paper proposes an outlier detection method for high-dimensional data by introducing a new locally relevant subspace selection technique and developing a local density-based outlier scoring scheme. First, we present a locally relevant subspace selection method based on local entropy, which selects a relevant subspace for each data point according to its neighbors. Each data point is then scored in its relevant subspace using a density-based local outlier scoring method. Our adaptive-bandwidth kernel density estimation eliminates the slight difference between the density of a normal data point and that of its neighbors, so normal data points are not wrongly detected as outliers; at the same time, the method underestimates the actual density of outlier points, making them more prominent. Experimental results on several real datasets show that the proposed local entropy-based subspace selection algorithm and outlier scoring achieve a high detection rate for outlier data.
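The abstract does not give the exact entropy or bandwidth formulas, but the overall pipeline it describes (per-point relevant-subspace selection via local entropy, then an adaptive-bandwidth kernel density score in that subspace) can be illustrated with a minimal sketch. All concrete choices below — histogram-based entropy, Euclidean k-nearest neighbors, treating low-entropy dimensions as locally relevant, and using the k-th neighbor distance as the kernel bandwidth — are illustrative assumptions, not the authors' actual method.

```python
import numpy as np

def local_entropy(values, bins=5):
    # Shannon entropy of a 1-D sample, estimated by histogram binning.
    hist, _ = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def relevant_subspace(X, i, k=10, keep=2):
    # k nearest neighbors of point i in the full space (Euclidean).
    d = np.linalg.norm(X - X[i], axis=1)
    nn = np.argsort(d)[1:k + 1]
    # Per-dimension entropy over the neighborhood; a low-entropy
    # dimension is one where neighbors agree, taken here (as an
    # illustrative assumption) to mean "locally relevant".
    ent = np.array([local_entropy(X[nn, j]) for j in range(X.shape[1])])
    return np.argsort(ent)[:keep]

def outlier_score(X, i, k=10, keep=2):
    # Score point i in its own locally relevant subspace.
    dims = relevant_subspace(X, i, k, keep)
    Xs = X[:, dims]
    d = np.linalg.norm(Xs - Xs[i], axis=1)
    nn = np.argsort(d)[1:k + 1]
    # Adaptive bandwidth: distance to the k-th neighbor in the subspace.
    h = d[nn[-1]] + 1e-12
    # Gaussian kernel density estimate at point i (up to constants);
    # the 1/h factor makes sparse neighborhoods yield low density.
    dens = np.mean(np.exp(-(d[nn] / h) ** 2)) / h
    return 1.0 / (dens + 1e-12)  # low density -> high outlier score
```

In this toy version, a point sitting far from any cluster gets a large bandwidth and hence a low density estimate, so its score is high relative to points inside a cluster, which is the qualitative behavior the abstract claims for its adaptive-bandwidth estimator.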