Image Annotation and Retrieval
Sajad Mohamadzadeh; Mohammad Gharehbagh
Abstract
Background and Objectives: Content-Based Image Retrieval (CBIR) systems are crucial for managing the exponential growth of digital imagery. Traditional methods that rely on handcrafted features often fail to scale and to capture semantic content. Although deep learning improves retrieval quality, challenges persist in computational complexity and efficiency. This paper introduces a hybrid CBIR framework that combines unsupervised deep feature learning, adaptive hashing, and VP-Tree-based hierarchical search optimization. The proposed system, evaluated on CIFAR-10, an ImageNet subset, and a custom medical imaging dataset, achieves a mean average precision (mAP) of 96.1% and reduces retrieval latency by approximately 40% compared to conventional methods. By leveraging autoencoder-driven latent feature extraction and scalable metric space partitioning, the framework demonstrates superior scalability, retrieval speed, and accuracy for large-scale applications.
Methods: The proposed framework employs autoencoder-driven latent space encoding to extract compact yet semantically rich feature representations, ensuring robust discriminability across diverse image categories. To enhance retrieval efficiency, a hybrid search mechanism is implemented: a Euclidean nearest-neighbor scheme with O(N log N) complexity is used for moderate-scale datasets, while a VP-Tree-based hashing scheme with O(log N) complexity is applied for large-scale retrieval scenarios. By leveraging hierarchical metric space partitioning, the method significantly reduces search complexity while maintaining retrieval accuracy.
Results: Extensive evaluations show that the proposed framework outperforms traditional and modern deep hashing techniques, achieving higher mean average precision, lower search latency, and better storage efficiency for both moderate and large-scale datasets. By integrating unsupervised representation learning, advanced hashing, and optimized search structures, the system surpasses conventional methods in speed and precision.
Conclusion: This study presents a highly scalable and computationally efficient CBIR framework that addresses the limitations of existing methods by combining unsupervised deep feature learning, adaptive hashing, and hierarchical search structures. The results highlight the framework's ability to achieve high retrieval accuracy and efficiency, making it suitable for real-time applications in large-scale multimedia repositories.
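The hierarchical metric space partitioning mentioned in the Methods can be illustrated with a small vantage-point tree. The following is a minimal sketch in pure Python and not the authors' implementation: the names `VPNode`, `build_vp_tree`, and `nearest` are hypothetical, random 4-dimensional tuples stand in for autoencoder latent codes, and the logarithmic search cost is only the expected behavior on well-distributed, low-dimensional data.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPNode:
    """One vantage point plus the median radius that splits its subtree."""

    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance <= radius
        self.outside = outside  # subtree of points with distance > radius


def build_vp_tree(points, rng):
    """Recursively partition points around randomly chosen vantage points."""
    if not points:
        return None
    points = list(points)
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [euclidean(vantage, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]
    inside = [p for p, d in zip(points, dists) if d <= radius]
    outside = [p for p, d in zip(points, dists) if d > radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, rng), build_vp_tree(outside, rng))


def nearest(node, query, best=None):
    """Return (distance, point) of the nearest stored point to query."""
    if node is None:
        return best
    d = euclidean(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    if d <= node.radius:
        # Query lies inside the ball: search inside first, then visit the
        # outside only if the current best ball crosses the boundary.
        best = nearest(node.inside, query, best)
        if d + best[0] > node.radius:
            best = nearest(node.outside, query, best)
    else:
        best = nearest(node.outside, query, best)
        if d - best[0] <= node.radius:
            best = nearest(node.inside, query, best)
    return best


# Toy usage: random vectors stand in for learned latent codes (assumption).
rng = random.Random(0)
codes = [tuple(rng.random() for _ in range(4)) for _ in range(200)]
tree = build_vp_tree(codes, rng)
distance, match = nearest(tree, (0.5, 0.5, 0.5, 0.5))
```

The triangle inequality is what justifies skipping a subtree: a point closer to the query than the current best must lie within that best distance of the query, so its distance to the vantage point is bounded on both sides.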
Image Annotation and Retrieval
A. Gheitasi; H. Farsi; S. Mohamadzadeh
Abstract
Background and Objectives: Freehand sketching is an easy-to-use yet effective instrument for human-computer interaction. Sketches are highly abstract, which creates a domain gap between a sketch and the real image it is intended to depict. In addition to appearance information, shape information is believed to be very effective in sketch recognition and retrieval.
Methods: In machine vision, understanding freehand sketches has become increasingly important with the widespread use of touchscreen devices. Most sketch recognition and retrieval methods rely on appearance information. This article presents a hybrid network architecture, termed hybrid convolution, that comprises two subnetworks: S-Net (Sketch Network) and A-Net (Appearance Network). S-Net describes shape information and A-Net describes appearance information. In addition, a Canonical Correlation Analysis (CCA) module is used to align the sketch and image domains, which reduces the domain gap and improves sketch retrieval performance. Finally, sketch retrieval using the hybrid Convolutional Neural Network (CNN) and the CCA domain adaptation module is tested on several datasets, including Sketchy, TU-Berlin, and Flickr-15k. The experimental results show that the hybrid CNN with the CCA module achieves high accuracy compared to more sophisticated methods.
Results: The proposed method has been evaluated on two tasks: image classification and Sketch-Based Image Retrieval (SBIR). The proposed hybrid convolution outperforms other baseline networks, achieving a classification score of 84.44% on the TU-Berlin dataset and 82.76% on the Sketchy dataset. In SBIR, the proposed method stands out among deep-learning-based methods and outperforms non-deep methods by a significant margin.
Conclusion: This research presented a hybrid convolutional framework based on deep learning for pattern recognition. Compared to the best available methods, the hybrid convolutional network increased recognition and retrieval accuracy by around 5%. It is an efficient and thorough method that produced valid results in sketch-based image classification and retrieval on the TU-Berlin, Flickr-15k, and Sketchy datasets.
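To illustrate how a CCA module can reduce the gap between sketch and image features, the following is a minimal NumPy sketch. It assumes that CCA here denotes canonical correlation analysis and is not the authors' implementation: the function names `inv_sqrt` and `fit_cca`, the regularization value, and the synthetic "sketch" and "photo" features generated from a shared latent signal are all illustrative assumptions.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def fit_cca(x, y, k, reg=1e-6):
    """Fit k canonical directions for paired feature matrices x and y.

    Returns the feature means, the projection matrices for x and y,
    and the k largest canonical correlations.
    """
    x_mean, y_mean = x.mean(axis=0), y.mean(axis=0)
    xc, yc = x - x_mean, y - y_mean
    n = x.shape[0]
    # Regularized covariance estimates keep the whitening step stable.
    cxx = xc.T @ xc / (n - 1) + reg * np.eye(x.shape[1])
    cyy = yc.T @ yc / (n - 1) + reg * np.eye(y.shape[1])
    cxy = xc.T @ yc / (n - 1)
    cxx_is, cyy_is = inv_sqrt(cxx), inv_sqrt(cyy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations; singular vectors give the directions.
    u, s, vt = np.linalg.svd(cxx_is @ cxy @ cyy_is)
    wx = cxx_is @ u[:, :k]
    wy = cyy_is @ vt.T[:, :k]
    return x_mean, y_mean, wx, wy, s[:k]


# Synthetic paired features sharing a 2-D latent signal (assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mix_sketch = np.array([[1.0, 0.0, 1.0, 0.0, 1.0],
                       [0.0, 1.0, 0.0, 1.0, 0.0]])
mix_photo = np.array([[1.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 1.0]])
sketch_feats = latent @ mix_sketch + 0.1 * rng.normal(size=(500, 5))
photo_feats = latent @ mix_photo + 0.1 * rng.normal(size=(500, 4))

x_mean, y_mean, wx, wy, corr = fit_cca(sketch_feats, photo_feats, k=2)
sketch_proj = (sketch_feats - x_mean) @ wx  # sketches in the shared space
photo_proj = (photo_feats - y_mean) @ wy    # photos in the shared space
```

After projection, paired sketch and photo features are maximally correlated along each shared dimension, so a nearest-neighbor ranking in that space can compare the two domains directly.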