Distribution enhancement for imbalanced data with generative adversarial network

dc.authorscopusidWitold Pedrycz / 58861905800
dc.authorwosidWitold Pedrycz / FPE-7309-2022
dc.contributor.authorChen, Yueqi
dc.contributor.authorPedrycz, Witold
dc.contributor.authorPan, Tingting
dc.contributor.authorWang, Jian
dc.contributor.authorYang, Jie
dc.date.accessioned2025-04-18T09:17:35Z
dc.date.available2025-04-18T09:17:35Z
dc.date.issued2024
dc.departmentİstinye University, Faculty of Engineering and Natural Sciences, Department of Computer Engineering
dc.description.abstractHandling class imbalance in real-world applications remains challenging. Oversampling is a widely used remedy for imbalanced tabular data. However, most traditional oversampling methods generate samples by interpolating within the minority (positive) class and therefore fail to fully capture the probability density distribution of the original data. This paper presents GAN-E, a novel oversampling method based on a generative adversarial network (GAN) that introduces three strategies to enhance the distribution of the positive class. The first strategy injects prior knowledge of the positive class into the latent space of the GAN, improving sample emulation. The second injects random noise containing this prior knowledge into both original and generated positive samples to stretch the learning space of the GAN's discriminator. The third uses multiple GANs to learn comprehensive probability distributions of the positive class from multi-scale data, countering the tendency of a GAN to generate aggregated samples. Experimental results and statistical tests on 18 commonly used imbalanced datasets show that the proposed method outperforms 14 other rebalancing methods in terms of G-mean, F-measure, AUC, and accuracy.
dc.description.sponsorshipNational Key Research & Development Program of China; Fundamental Research Funds for the Central Universities; National Natural Science Foundation of China (NSFC)
dc.identifier.citationChen, Y., Pedrycz, W., Pan, T., Wang, J., & Yang, J. Distribution Enhancement for Imbalanced Data with Generative Adversarial Network. Advanced Theory and Simulations, 2400234.
dc.identifier.doi10.1002/adts.202400234
dc.identifier.issn2513-0390
dc.identifier.scopus2-s2.0-85196549785
dc.identifier.scopusqualityQ1
dc.identifier.urihttp://dx.doi.org/10.1002/adts.202400234
dc.identifier.urihttps://hdl.handle.net/20.500.12713/6747
dc.identifier.wosWOS:001251541400001
dc.identifier.wosqualityQ2
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.institutionauthorPedrycz, Witold
dc.institutionauthoridWitold Pedrycz / 0000-0002-9335-9930
dc.language.isoen
dc.publisherWiley
dc.relation.ispartofAdvanced theory and simulations
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/closedAccess
dc.subjectGAN (Generative Adversarial Network)
dc.subjectImbalanced Learning
dc.subjectMode Collapse
dc.subjectOversampling
dc.titleDistribution enhancement for imbalanced data with generative adversarial network
dc.typeArticle
