MinJu Jeon

Hi, I'm MinJu Jeon, a Master's student in Data Science at Hanyang University and currently a Research Intern at Naver Cloud (Voice Tech). I work on problems at the intersection of language and perception: multilingual speech (G2P/TTS), video-language understanding, and the messy data work that makes models actually usable in production.

Multimodal Learning · Video-Text Retrieval · Speech & Language · Data-Centric AI

News

Mar 2026: Cap4Bridge accepted at IEEE Access 2026

Feb 2026: Two papers accepted at CVPR 2026

Dec 2025: Started research internship at Naver Cloud, Voice Tech Team

Aug 2025: Sali4Vid accepted at EMNLP 2025 (Long, Main)

Background

Dec. 2025 – now

Research Intern, Naver Cloud · Voice Tech Team
Multilingual G2P & robust TTS for non-canonical text

Sep. 2024 – now

M.S. Data Science, Hanyang University

Mar. 2020 – Aug. 2024

B.S. Industrial Engineering, Hanyang University

Selected Publications

EMNLP

Sali4Vid: Saliency-Aware Video Reweighting and Adaptive Caption Retrieval for Dense Video Captioning

MinJu Jeon, Si-Woo Kim, Ye-Chan Kim, HyunGee Kim, and Dong-Jin Kim

In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Bib

@inproceedings{jeon2025sali4vid,
  title = {Sali4Vid: Saliency-Aware Video Reweighting and Adaptive Caption Retrieval for Dense Video Captioning},
  author = {Jeon, MinJu and Kim, Si-Woo and Kim, Ye-Chan and Kim, HyunGee and Kim, Dong-Jin},
  booktitle = {Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  pages = {25788--25801},
  year = {2025},
}

IEEE Access

Cap4Bridge: Caption-Guided Cross-Modal Contextualization with Stochastic Augmentation for Text-Video Retrieval

MinJu Jeon, Hyungee Kim, Si-Woo Kim, Youngtaek Oh, Soeun Lee, and Dong-Jin Kim

IEEE Access, 2026

Bib

@article{jeon2026cap4bridge,
  title = {Cap4Bridge: Caption-Guided Cross-Modal Contextualization with Stochastic Augmentation for Text-Video Retrieval},
  author = {Jeon, MinJu and Kim, Hyungee and Kim, Si-Woo and Oh, Youngtaek and Lee, Soeun and Kim, Dong-Jin},
  journal = {IEEE Access},
  year = {2026},
}

ACM MM

SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning

Si-Woo Kim, MinJu Jeon, Ye-Chan Kim, Soeun Lee, Taewhan Kim, and Dong-Jin Kim

In Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Bib

@inproceedings{kim2025sync,
  title = {SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning},
  author = {Kim, Si-Woo and Jeon, MinJu and Kim, Ye-Chan and Lee, Soeun and Kim, Taewhan and Kim, Dong-Jin},
  booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia},
  pages = {2683--2692},
  year = {2025},
}

CVPR

Follow the Saliency: Supervised Saliency for Retrieval-augmented Dense Video Captioning

Seunghee Choi, MinJu Jeon, Hyunwoo Oh, Jihwan Lee, and Dong-Jin Kim

arXiv preprint arXiv:2603.11460, 2026

Accepted at CVPR 2026

Bib

@article{choi2026follow,
  title = {Follow the Saliency: Supervised Saliency for Retrieval-augmented Dense Video Captioning},
  author = {Choi, Seunghee and Jeon, MinJu and Oh, Hyunwoo and Lee, Jihwan and Kim, Dong-Jin},
  journal = {arXiv preprint arXiv:2603.11460},
  note = {Accepted at CVPR 2026},
  year = {2026},
}

CVPR

SAIL: Similarity-Aware Guidance and Inter-Caption Augmentation-based Learning for Weakly-Supervised Dense Video Captioning

Ye-Chan Kim, SeungJu Cha, Si-Woo Kim, MinJu Jeon, Hyungee Kim, and Dong-Jin Kim

arXiv preprint arXiv:2603.05437, 2026

Accepted at CVPR 2026

Bib

@article{kim2026sail,
  title = {SAIL: Similarity-Aware Guidance and Inter-Caption Augmentation-based Learning for Weakly-Supervised Dense Video Captioning},
  author = {Kim, Ye-Chan and Cha, SeungJu and Kim, Si-Woo and Jeon, MinJu and Kim, Hyungee and Kim, Dong-Jin},
  journal = {arXiv preprint arXiv:2603.05437},
  note = {Accepted at CVPR 2026},
  year = {2026},
}