Paper Title
IMAGE CAPTION GENERATION USING LSTMS AND CONVOLUTIONAL NEURAL NETWORKS
Article Identifiers
Registration ID: IJNRD_205981
Published ID: IJNRD2309391
DOI: http://doi.one/10.1729/Journal.36336
Authors
D Kanishya Gayathri, L Prinslin
Keywords
Image Captioning, Convolutional Neural Networks, LSTMs, Deep Learning, Computer Vision, Natural Language Processing, MSCOCO Dataset, Data Preprocessing, Model Architecture, Training, Evaluation Metrics, Encoder-Decoder, Teacher Forcing, Cross-Entropy Loss, Metric-Based Evaluation, BLEU, METEOR, CIDEr, ROUGE, Multimodal AI, Visual Understanding, Text Generation, Image Description, Machine Learning, Deep Neural Networks.
Abstract
Image caption generation sits at the intersection of computer vision and natural language processing, with applications in assistive technology, content retrieval, and human-computer interaction. This project combines Convolutional Neural Networks (CNNs) and Long Short-Term Memory networks (LSTMs) to build a complete image captioning system. We began with data collection and pre-processing, drawing on established datasets such as MSCOCO. Images were resized and normalized, and features were extracted with a pre-trained CNN that serves as the image encoder. The core of the work is the model architecture: a two-part structure with a CNN as the image encoder and an LSTM as the text decoder. The CNN extracts salient image features, while the LSTM generates coherent, contextually relevant captions and is trained with a cross-entropy loss. Teacher forcing is used to stabilize convergence during training. We evaluated the system with BLEU, METEOR, CIDEr, and ROUGE, benchmarking the generated captions against human-annotated references. This quantitative analysis characterizes the system's performance and highlights areas for improvement. We conclude by reflecting on the challenges encountered during the project and proposing directions for future research in image caption generation. Our aim is to refine the system so that it generates captions that are not only accurate but also contextually rich, in support of multimodal AI applications.
In summary, this project shows how computer vision and natural language processing complement each other, and it underscores the potential of CNN-LSTM architectures for image captioning. The work contributes to the growing field of multimodal AI and its applications across diverse domains.
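The training setup described in the abstract (teacher forcing with a cross-entropy loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the special tokens, the toy caption, and the uniform probabilities standing in for the LSTM decoder's softmax outputs are all assumptions made for illustration.

```python
import math

# Hypothetical special tokens (assumed; the paper does not name them).
START, END = "<start>", "<end>"

def teacher_forcing_pairs(caption_tokens):
    """Return (decoder_inputs, targets) for one reference caption.

    With teacher forcing, the decoder input at step t is the ground-truth
    token from step t-1 rather than the model's own previous prediction.
    """
    sequence = [START] + list(caption_tokens) + [END]
    return sequence[:-1], sequence[1:]

def cross_entropy(step_probs, targets):
    """Mean negative log-likelihood of the target tokens.

    step_probs holds one dict per time step, mapping token -> probability.
    """
    eps = 1e-12  # guards against log(0) when a target gets zero probability
    losses = [-math.log(max(probs.get(tok, 0.0), eps))
              for probs, tok in zip(step_probs, targets)]
    return sum(losses) / len(losses)

# Toy caption (assumed example, not taken from MSCOCO).
inputs, targets = teacher_forcing_pairs(["a", "dog", "runs"])

# Stand-in for decoder outputs: a uniform distribution over a 4-token vocabulary.
vocab = ["a", "dog", "runs", END]
uniform = [{tok: 1.0 / len(vocab) for tok in vocab} for _ in targets]
loss = cross_entropy(uniform, targets)
```

In this sketch the targets are the decoder inputs shifted by one position, and an untrained, uniform decoder over four tokens yields a loss of log 4; training would aim to push the loss below that baseline.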
How To Cite
"IMAGE CAPTION GENERATION USING LSTMS AND CONVOLUTIONAL NEURAL NETWORKS", IJNRD - INTERNATIONAL JOURNAL OF NOVEL RESEARCH AND DEVELOPMENT (www.IJNRD.org), ISSN: 2456-4184, Vol. 8, Issue 9, page no. d775-d781, September-2023, Available: https://ijnrd.org/papers/IJNRD2309391.pdf
Issue
Volume 8 Issue 9, September-2023
Pages: d775-d781
Other Publication Details
Downloads: 000121132
Research Area: Engineering
Country: Chennai, Tamilnadu, India
Published Paper PDF: https://ijnrd.org/papers/IJNRD2309391.pdf
Published Paper URL: https://ijnrd.org/viewpaperforall?paper=IJNRD2309391
About Publisher
Journal Name: INTERNATIONAL JOURNAL OF NOVEL RESEARCH AND DEVELOPMENT (IJNRD)
ISSN: 2456-4184 | IMPACT FACTOR: 8.76 Calculated By Google Scholar | ESTD YEAR: 2016
Publisher: IJNRD (IJ Publication) Janvi Wave
Licence
This work is licensed under a Creative Commons Attribution 4.0 International License and The Open Definition

