Publications ordered by type in reverse chronological order.
Journals
Multi-Task Adaptive Gating Network for Trajectory Distilled Control Prediction
Shoaib Azam, and Ville Kyrki
IEEE Robotics and Automation Letters, 2024
End-to-end autonomous driving is often categorized based on output into trajectory prediction or control prediction. Each type of approach provides benefits in different contexts, resulting in recent studies on how to combine them. However, the current proposals are based on heuristic choices that only partially capture the complexities of varying driving conditions. How to best fuse these sources of information remains an open research question. To address this, we introduce MAGNet, a Multi-Task Adaptive Gating Network for Trajectory Distilled Control Prediction. This framework employs a multi-task learning strategy to combine trajectory and direct control prediction. Our key insight is to design a gating network that learns how to optimally combine the outputs of trajectory and control predictions in each situation. Using the CARLA simulator, we evaluate MAGNet in closed-loop settings with challenging scenarios. Results show that MAGNet outperforms the state-of-the-art on two publicly available CARLA benchmarks, Town05 Long and Longest6.
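The gating idea can be illustrated with a minimal PyTorch sketch (not the MAGNet implementation; layer sizes and inputs are assumptions): a small network predicts a per-sample weight that blends the control signal derived from the predicted trajectory with the directly predicted control.

```python
# Illustrative sketch (not the authors' code): a learned gate that blends a
# trajectory-derived control signal with a directly predicted control signal.
import torch
import torch.nn as nn

class AdaptiveGate(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        # The gate maps shared scene features to a per-sample weight in [0, 1].
        self.gate = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, feats, ctrl_from_traj, ctrl_direct):
        w = self.gate(feats)                      # (B, 1) blending weight
        return w * ctrl_from_traj + (1 - w) * ctrl_direct

# Usage with hypothetical tensors: 256-d scene features, 3-d control (steer, throttle, brake).
gate = AdaptiveGate(feat_dim=256)
fused = gate(torch.randn(8, 256), torch.randn(8, 3), torch.randn(8, 3))
print(fused.shape)  # torch.Size([8, 3])
```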
Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving
arXiv preprint arXiv:2210.06758, 2024
Learning contextual and spatial environmental representations enhances an autonomous vehicle’s hazard anticipation and decision-making in complex scenarios. Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context. Humans, when driving, naturally employ neural maps that integrate various factors such as historical data, situational subtleties, and behavioral predictions of other road users to form a rich contextual understanding of their surroundings. This neural map-based comprehension is integral to making informed decisions on the road. In contrast, even with their significant advancements, autonomous systems have yet to fully harness this depth of human-like contextual understanding. Motivated by this, our work draws inspiration from human driving patterns and seeks to formalize the sensor fusion approach within an end-to-end autonomous driving framework. We introduce a framework that integrates three cameras (left, right, and center) to emulate the human field of view, coupled with top-down bird’s-eye-view semantic data to enhance contextual representation. The sensor data is fused and encoded using a self-attention mechanism, leading to an auto-regressive waypoint prediction module. We treat feature representation as a sequential problem, employing a vision transformer to distill the contextual interplay between sensor modalities. The efficacy of the proposed method is experimentally evaluated in both open and closed-loop settings. Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset. In closed-loop evaluations on CARLA’s Town05 Long and Longest6 benchmarks, the proposed method improves driving performance and route completion while reducing infractions.
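A minimal sketch of the fusion-and-decoding idea follows (not the paper's architecture; the dimensions, mean pooling, and GRU decoder are assumptions): per-sensor feature tokens are mixed with self-attention and waypoints are then decoded auto-regressively.

```python
# Illustrative sketch: self-attention over per-sensor tokens, then auto-regressive
# waypoint prediction from the pooled context. Sizes are placeholders.
import torch
import torch.nn as nn

class FusionWaypointHead(nn.Module):
    def __init__(self, d_model=256, n_waypoints=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        self.gru = nn.GRUCell(input_size=2, hidden_size=d_model)
        self.out = nn.Linear(d_model, 2)
        self.n_waypoints = n_waypoints

    def forward(self, tokens):                 # tokens: (B, num_sensors, d_model)
        fused, _ = self.attn(tokens, tokens, tokens)
        h = fused.mean(dim=1)                  # pooled context as initial hidden state
        wp = torch.zeros(tokens.size(0), 2, device=tokens.device)  # start at ego (0, 0)
        waypoints = []
        for _ in range(self.n_waypoints):
            h = self.gru(wp, h)
            wp = wp + self.out(h)              # predict a displacement per step
            waypoints.append(wp)
        return torch.stack(waypoints, dim=1)   # (B, n_waypoints, 2)

head = FusionWaypointHead()
print(head(torch.randn(2, 4, 256)).shape)      # torch.Size([2, 4, 2])
```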
Multimodal fusion for sensorimotor control in steering angle prediction
Engineering Applications of Artificial Intelligence, 2023
Imitation learning, employed to learn sensorimotor coordination for steering angle prediction in an end-to-end fashion, requires expert demonstrations. These expert demonstrations are paired with environmental perception and vehicle control data. The conventional frame-based RGB camera is the most common exteroceptive sensor modality used to acquire the environmental perception data, and it has produced promising results when used as a single modality in learning end-to-end lateral control. However, the conventional frame-based RGB camera has limited operability under illumination variation and is affected by motion blur. The event camera provides complementary information to the frame-based RGB camera. This work explores the fusion of frame-based RGB and event data for learning end-to-end lateral control by predicting the steering angle, and examines how the representation from event data, fused with frame-based RGB data, helps predict lateral control robustly for the autonomous vehicle. To this end, we propose DRFuser, a novel convolutional encoder-decoder architecture for learning end-to-end lateral control. The encoder module is branched between the frame-based RGB data and event data along with self-attention layers. Moreover, this study also contributes our own collected dataset comprising event, frame-based RGB, and vehicle control data. The efficacy of the proposed method is experimentally evaluated on our collected dataset, the Davis Driving dataset (DDD), and the Carla Eventscape dataset. The experimental results illustrate that the proposed method, DRFuser, outperforms the state-of-the-art in terms of root-mean-square error (RMSE) and mean absolute error (MAE), used as evaluation metrics.
Exploring thermal images for object detection in underexposure regions for autonomous driving
Farzeen Munir, Shoaib Azam, Muhammd Aasim Rafique, Ahmad Muqeem Sheri, Moongu Jeon, and Witold Pedrycz
Applied Soft Computing, 2022
Underexposure regions are vital in constructing a complete perception of the surrounding environment for safe autonomous driving. The availability of thermal cameras has provided an essential alternative to explore regions where other optical sensors lack in capturing interpretable signals. A thermal camera captures an image using the heat difference emitted by objects in the infrared spectrum, and object detection in thermal images becomes effective for autonomous driving in challenging conditions. Although object detection in the visible spectrum domain has matured, thermal object detection lacks effectiveness. A significant challenge is the scarcity of labeled data for the thermal domain, which is essential for SOTA artificial intelligence techniques. This work proposes a domain adaptation framework that employs a style transfer technique for transfer learning from visible spectrum images to thermal images. The framework uses a generative adversarial network (GAN) to transfer the low-level features from the visible spectrum domain to the thermal domain through style consistency. The efficacy of the proposed object detection method in thermal images is evident from the improved results when using styled images from publicly available thermal image datasets (FLIR ADAS and KAIST Multi-Spectral).
Drivable region estimation for self-driving vehicles using radar
Muhammad Ishfaq Hussain, Shoaib Azam, Muhammad Aasim Rafique, Ahmad Muqeem Sheri, and Moongu Jeon
IEEE Transactions on Vehicular Technology, 2022
Self-driving vehicles are posing new challenges as the automation level defined in the SAE International standards for autonomous driving increases. A pivotal task in autonomous driving is building a perception of the surrounding environment using optical sensors, which is a long-standing challenge and prompts us to explore the utilization of various sensors. Radar is an older and cheaper type of sensor than alternatives such as lidar for long-range distance coverage, and it is also competitively reliable and robust in adverse weather conditions. However, sparse data and noise are inherent challenges of radar. This study explores the dynamic Gaussian process for occupancy mapping and predicting a drivable path for a self-driving vehicle within the field of view (FOV) of a radar sensor. Gaussian occupancy mapping does not need abundant data for training and is a promising alternative to data-reliant deep learning techniques. The proposed technique optimizes parameters (variational and kernel-based) of the Gaussian process to determine the allowed region within the FOV limits by means of stochastic selection of functional points (pseudo-inputs) and tuning of threshold values. We have tested the proposed technique in experiments performed under different environmental conditions, such as various road and traffic conditions and diverse weather and illumination conditions. The results verify the efficacy of the proposed technique in diverse weather conditions for finding a drivable path for a self-driving vehicle, with the additional benefits of requiring only a low-cost apparatus and providing coverage of a long distance range.
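To illustrate the occupancy-mapping idea, the sketch below uses scikit-learn's GP classifier as a simple stand-in (it omits the paper's variational inducing points and dynamic formulation; the radar points, labels, and threshold are placeholders):

```python
# Illustrative sketch: a GP classifier over 2-D radar returns as a stand-in for a
# Gaussian-process occupancy map; cells with low occupancy probability are drivable.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
# Hypothetical training data: (x, y) radar points labelled occupied (1) or free (0).
points = rng.uniform(-20, 20, size=(200, 2))
labels = (np.abs(points[:, 1]) > 5).astype(int)   # toy rule: far from centreline = occupied

gp = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=3.0)).fit(points, labels)

# Query a grid inside the radar field of view.
xs, ys = np.meshgrid(np.linspace(-20, 20, 40), np.linspace(-20, 20, 40))
grid = np.column_stack([xs.ravel(), ys.ravel()])
occupancy = gp.predict_proba(grid)[:, 1].reshape(xs.shape)
drivable = occupancy < 0.3
print(drivable.sum(), "of", drivable.size, "cells estimated drivable")
```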
N2C: neural network controller design using behavioral cloning
Shoaib Azam, Farzeen Munir, Muhammad Aasim Rafique, Ahmad Muqeem Sheri, Muhammad Ishfaq Hussain, and Moongu Jeon
IEEE Transactions on Intelligent Transportation Systems, 2021
Modern vehicles communicate data to and from sensors, actuators, and electronic control units (ECUs) using the Controller Area Network (CAN) bus, which operates on differential signaling. An autonomous ECU responsible for executing decision commands in an autonomous vehicle is developed by assimilating information from the CAN bus. The conventional way of parsing decision commands is motion planning, which uses a path tracking algorithm to evaluate the decision commands. This study focuses on designing a robust controller using behavioral cloning and motion planning for an autonomous vehicle using a deep learning framework. In the first part of this study, we explore the pipeline of parsing decision commands from the path tracking algorithm to the controller and propose a neural network-based controller (N2C) using behavioral cloning. The proposed network predicts throttle, brake, and torque when trained with manual driving data acquired from the CAN bus. The efficacy of the proposed method is demonstrated by comparing its accuracy with a Proportional-Integral-Derivative (PID) controller in conjunction with the path tracking algorithm (pure pursuit and a model predictive control based path follower). The second part of this study complements N2C: an end-to-end neural network for predicting speed and steering angle is proposed with image data as input. The performance of the proposed frameworks is evaluated in real time and on the Udacity dataset, showing better metric scores in the former case and reliable prediction in the latter when compared with the state-of-the-art methods.
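A minimal behavioral-cloning sketch of the N2C idea is given below (illustrative only; the input features, network sizes, and data are placeholders, not the paper's configuration):

```python
# Illustrative behavioural-cloning sketch: regress throttle, brake, and torque from
# vehicle-state features logged on the CAN bus. All data here is placeholder noise.
import torch
import torch.nn as nn

controller = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),      # 8 assumed state features (speed, yaw rate, ...)
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 3),                 # outputs: throttle, brake, torque
)
optimiser = torch.optim.Adam(controller.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder batch standing in for manual-driving CAN logs.
states = torch.randn(256, 8)
expert_actions = torch.randn(256, 3)
for _ in range(100):
    optimiser.zero_grad()
    loss = loss_fn(controller(states), expert_actions)
    loss.backward()
    optimiser.step()
```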
Key points estimation and point instance segmentation approach for lane detection
IEEE Transactions on Intelligent Transportation Systems, 2021
Perception techniques for autonomous driving should be adaptive to various environments. In essential perception modules for traffic line detection, many conditions should be considered, such as the number of traffic lines and the computing power of the target system. To address these problems, in this paper we propose a traffic line detection method called the Point Instance Network (PINet); the method is based on key points estimation and an instance segmentation approach. PINet includes several hourglass models that are trained simultaneously with the same loss function, so the size of the trained model can be chosen according to the target environment’s computing power. We cast the clustering problem of the predicted key points as an instance segmentation problem, so PINet can be trained regardless of the number of traffic lines. PINet achieves competitive accuracy and false positive rates on CULane and TuSimple, popular public datasets for lane detection. Our code is available at https://github.com/koyeongmin/PINet_new
LDNet: End-to-end lane marking detection approach using a dynamic vision sensor
IEEE Transactions on Intelligent Transportation Systems, 2021
Modern vehicles are equipped with various driver-assistance systems, including automatic lane keeping, which prevents unintended lane departures. Traditional lane detection methods incorporate handcrafted or deep learning-based features followed by postprocessing techniques for lane extraction using frame-based RGB cameras. The utilization of frame-based RGB cameras for lane detection tasks is prone to illumination variations, sun glare, and motion blur, which limits the performance of lane detection methods. Incorporating an event camera for lane detection in the perception stack of autonomous driving is one of the most promising solutions for mitigating the challenges encountered by frame-based RGB cameras. The main contribution of this work is the design of a lane marking detection model that employs the dynamic vision sensor. This paper explores the novel application of lane marking detection using an event camera by designing a convolutional encoder followed by an attention-guided decoder. The spatial resolution of the encoded features is retained by a dense atrous spatial pyramid pooling (ASPP) block. The additive attention mechanism in the decoder improves performance for high-dimensional input encoded features, promoting lane localization and relieving postprocessing computation. The efficacy of the proposed work is evaluated using the DVS dataset for lane extraction (DET). The experimental results show a significant improvement of 5.54% and 5.03% in F1 scores in multiclass and binary-class lane marking detection tasks, respectively. Additionally, the intersection over union (IoU) scores of the proposed method surpass those of the best-performing state-of-the-art method by 6.50% and 9.37% in multiclass and binary-class tasks, respectively.
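The dense ASPP idea can be sketched as parallel dilated convolutions whose outputs are concatenated and projected; the block below is illustrative (dilation rates and channel sizes are assumptions, not LDNet's):

```python
# Sketch of a dilated-convolution (ASPP-style) block that preserves spatial resolution.
import torch
import torch.nn as nn

class ASPPBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Parallel 3x3 convolutions with increasing dilation enlarge the receptive
        # field without downsampling the encoded features.
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in (1, 6, 12, 18))
        self.project = nn.Conv2d(4 * out_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.project(torch.cat([branch(x) for branch in self.branches], dim=1))

block = ASPPBlock(64, 64)
print(block(torch.randn(1, 64, 32, 64)).shape)   # torch.Size([1, 64, 32, 64])
```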
Transfer learning for vehicle detection using two cameras with different focal lengths
This paper proposes a vehicle detection method using transfer learning for two cameras with different focal lengths. A detected vehicle region in an image of one camera is transformed into a binary map. After that, the map is used to filter convolutional neural network (CNN) feature maps which are computed for the other camera’s image. We also introduce a robust evolutionary algorithm that is used to compute the relationship between the two cameras in an off-line mode efficiently. We capture video sequences and sample them to make a dataset that includes images with different focal lengths for vehicle detection. We compare the proposed vehicle detection method with baseline detection methods, including faster region proposal networks (Faster-RCNN), single-shot-multi-Box detector (SSD), and detector using recurrent rolling convolution (RRC), in the same experimental context. The experimental results show that the proposed method can detect vehicles at a wide range of distances accurately and robustly, and significantly outperforms the baseline detection methods.
System, design and experimental validation of autonomous vehicle in an unconstrained environment
Shoaib Azam, Farzeen Munir, Ahmad Muqeem Sheri, Joonmo Kim, and Moongu Jeon
Sensors, 2020
In recent years, technological advancements have made a promising impact on the development of autonomous vehicles. The evolution of electric vehicles, the development of state-of-the-art sensors, and advances in artificial intelligence have provided the necessary tools for academia and industry to develop prototypes of autonomous vehicles that enhance road safety and traffic efficiency. However, the increasing number of sensors deployed on an autonomous vehicle makes it less cost-effective for the consumer. This work focuses on the development of a full-stack autonomous vehicle using a limited sensor suite. The architecture of the autonomous vehicle is categorized into four layers: the sensor layer, perception layer, planning layer, and control layer. In the sensor layer, the integration of exteroceptive and proprioceptive sensors on the autonomous vehicle is presented. The perception of the environment, in terms of localization and detection using exteroceptive sensors, is included in the perception layer. In the planning layer, algorithms for mission and motion planning are illustrated, incorporating route information, velocity replanning, and obstacle avoidance. The control layer constitutes lateral and longitudinal control of the autonomous vehicle. For verification of the proposed system, the autonomous vehicle is tested in an unconstrained environment. The experimental results show the efficacy of each module, including localization, object detection, mission and motion planning, obstacle avoidance, velocity replanning, and lateral and longitudinal control. Further, to demonstrate the experimental validation and the application aspect of the autonomous vehicle, the proposed system is tested as an autonomous taxi service.
Saliency Based Object Detection and Enhancements Using Spectral Residual Approach in Static Images and Videos
Muhammad Shoaib Azam, Syed Omer Gilani, Mohsin Jamil, Yasar Ayaz, Muhammad Naveed, and Muhammad Nasir Khan
Advanced Science Letters, 2015
Salient feature extraction in images and videos is of high concern from the aspect of object detection. Today there are many techniques used to extract salient features. Salient features are basically the most attention-grabbing features as seen by the human eye. In the frequency domain, the most appropriate method is the spectral residual approach using the Phase Fourier Transform (PFT), which gives better results than other techniques. In this paper we implement the spectral residual method using the Phase Fourier Transform to find the salient areas. These results are immensely improved by applying edge detection techniques and morphological operations: to make objects detectable, the Sobel operator and dilation are used. After applying them we get better results and a very clean view of the salient areas, and in some cases we achieve almost total object detection. Furthermore, PFT is implemented on videos, and for object detection the Sobel operator and dilation are applied to the results given by PFT. Finally, the Area under the Receiver Operating Characteristic curve (AUC) is calculated for both images and videos.
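The phase-only (PFT) saliency computation with Sobel and dilation post-processing can be sketched as follows (parameters, thresholds, and the placeholder image are illustrative, not those used in the paper):

```python
# Rough sketch of phase-spectrum (PFT) saliency with Sobel and dilation post-processing.
import numpy as np
from scipy import ndimage

def pft_saliency(gray: np.ndarray) -> np.ndarray:
    spectrum = np.fft.fft2(gray)
    phase_only = np.exp(1j * np.angle(spectrum))       # keep phase, discard magnitude
    saliency = np.abs(np.fft.ifft2(phase_only)) ** 2
    return ndimage.gaussian_filter(saliency, sigma=3)  # smooth the saliency map

gray = np.random.rand(120, 160)                        # placeholder grayscale frame
sal = pft_saliency(gray)
mask = sal > sal.mean() + 2 * sal.std()                # threshold salient regions
edges = np.hypot(ndimage.sobel(mask.astype(float), 0),
                 ndimage.sobel(mask.astype(float), 1)) > 0
detected = ndimage.binary_dilation(edges, iterations=2)
print(detected.sum(), "pixels flagged as salient object boundary")
```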
Patents
Face de-identification method (translated), Registration #: 10-1861520
Muhammad Aasim Rafique, Shoaib Azam, InMoon Choi, and Moongu Jeon
2017
Conferences & Workshops
Challenges of Data-Driven Simulation of Diverse and Consistent Human Driving Behaviors
Kalle Kujanpää, Daulet Baimukashev, Shibei Zhu, Shoaib Azam, Farzeen Munir, Gokhan Alcan, and Ville Kyrki
2024
Building simulation environments for developing and testing autonomous vehicles necessitates that the simulators accurately model the statistical realism of the real-world environment, including the interaction with other vehicles driven by human drivers. To address this requirement, an accurate human behavior model is essential to incorporate the diversity and consistency of human driving behavior. We propose a mathematical framework for designing a data-driven simulation model that simulates human driving behavior more realistically than the currently used physics-based simulation models. Experiments conducted using the NGSIM dataset validate our hypothesis regarding the necessity of considering the complexity, diversity, and consistency of human driving behavior when aiming to develop realistic simulators.
Exploring Large Language Models for Trajectory Prediction: A Technical Perspective
Farzeen Munir, Tsvetomila Mihaylova, Shoaib Azam, Tomasz Piotr Kucner, and Ville Kyrki
In Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, 2024
Large Language Models (LLMs) have recently been proposed for trajectory prediction in autonomous driving, where they can potentially provide explainable reasoning about driving situations. Most studies use versions of the OpenAI GPT, while open-source alternatives have not been evaluated in this context. In this report, we study their trajectory prediction performance as well as their ability to reason about the situation. Our results indicate that open-source alternatives are feasible for trajectory prediction. However, their ability to describe situations and reason about potential consequences of actions appears limited, and warrants future research.
Radar-Lidar Fusion for Object Detection by Designing Effective Convolution Networks
Farzeen Munir, Shoaib Azam, Tomasz Kucner, Ville Kyrki, and Moongu Jeon
In 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), 2023
Object detection is a core component of perception systems, providing the ego vehicle with information about its surroundings to ensure safe route planning. While cameras and Lidar have significantly advanced perception systems, their performance can be limited in adverse weather conditions. In contrast, millimeter-wave technology enables radars to function effectively in such conditions. However, relying solely on radar for building a perception system doesn’t fully capture the environment due to the data’s sparse nature. To address this, sensor fusion strategies have been introduced. We propose a dual-branch framework to integrate radar and Lidar data for enhanced object detection. The primary branch focuses on extracting radar features, while the auxiliary branch extracts Lidar features. These are then combined using additive attention. Subsequently, the integrated features are processed through a novel Parallel Forked Structure (PFS) to manage scale variations. A region proposal head is then utilized for object detection. We evaluated the effectiveness of our proposed method on the Radiate dataset using COCO metrics. The results show that it surpasses state-of-the-art methods by 1.89% and 2.61% in favorable and adverse weather conditions, respectively. This underscores the value of radar-Lidar fusion in achieving precise object detection and localization, especially in challenging weather conditions.
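The additive-attention fusion step can be sketched as a Bahdanau-style gate over the two feature maps (an illustrative formulation; the channel sizes and exact gating form are assumptions, not the paper's layers):

```python
# Sketch of additive attention used to fuse a radar feature map with a Lidar feature map.
import torch
import torch.nn as nn

class AdditiveFusion(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.w_radar = nn.Conv2d(channels, channels, kernel_size=1)
        self.w_lidar = nn.Conv2d(channels, channels, kernel_size=1)
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, radar_feat, lidar_feat):
        # Attention weights computed from the sum of the projected features.
        attn = torch.sigmoid(self.score(torch.tanh(self.w_radar(radar_feat) +
                                                   self.w_lidar(lidar_feat))))
        return radar_feat + attn * lidar_feat   # radar branch gated by Lidar evidence

fusion = AdditiveFusion()
out = fusion(torch.randn(1, 64, 100, 100), torch.randn(1, 64, 100, 100))
print(out.shape)   # torch.Size([1, 64, 100, 100])
```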
Channel boosting feature ensemble for radar-based object detection
Shoaib Azam, Farzeen Munir, and Moongu Jeon
In 2021 IEEE Intelligent Vehicles Symposium (IV), 2021
Autonomous vehicles are conceived to provide safe and secure services by validating the safety standards indicated by SOTIF-ISO/PAS-21448 (Safety of the Intended Functionality). In this context, the perception of the environment plays an instrumental role in conjunction with the localization, planning, and control modules. As a pivotal algorithm in the perception stack, object detection provides extensive insights into the autonomous vehicle’s surroundings. Cameras and Lidar are extensively utilized for object detection among different sensor modalities, but these exteroceptive sensors have limitations in resolution and in adverse weather conditions. In this work, radar-based object detection is explored, as it provides a counterpart sensor modality that can be deployed and used in adverse weather conditions. Because radar produces complex data, a channel boosting feature ensemble method with a transformer encoder-decoder network is proposed. The object detection task using radar is formulated as a set prediction problem and evaluated on the publicly available dataset [1] in both good and good-bad weather conditions. The proposed method’s efficacy is extensively evaluated using the COCO evaluation metric, and the best proposed model surpasses its state-of-the-art counterpart by 12.55% and 12.48% in good and good-bad weather conditions, respectively.
SSTN: Self-supervised domain adaptation thermal object detection for autonomous driving
Farzeen Munir, Shoaib Azam, and Moongu Jeon
In 2021 IEEE/RSJ international conference on intelligent robots and systems (IROS), 2021
The perception of the environment plays a decisive role in the safe and secure operation of autonomous vehicles and is closely analogous to human vision. The human brain perceives the environment by utilizing different sensory channels and develops a view-invariant representation model. In this context, different exteroceptive sensors, such as cameras and Lidar, are deployed on the autonomous vehicle to perceive the environment. These sensors have illustrated their benefit in the visible spectrum domain, yet they are limited in adverse weather conditions; for instance, they have limited operational capability at night, which can lead to fatal accidents. This work explores thermal object detection to model a view-invariant representation by employing a self-supervised contrastive learning approach. We propose a deep neural network, the Self-Supervised Thermal Network (SSTN), for learning a feature embedding that maximizes the information between the visible and infrared spectrum domains by contrastive learning. These learned feature representations are then employed for thermal object detection using a multi-scale encoder-decoder transformer network. The proposed method is extensively evaluated on two publicly available datasets: the FLIR-ADAS dataset and the KAIST Multi-Spectral dataset. The experimental results illustrate the efficacy of the proposed method.
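The cross-modal contrastive objective can be sketched with a generic InfoNCE-style loss between visible and thermal embeddings of the same scene (an illustrative formulation, not necessarily the exact loss used in SSTN):

```python
# Illustrative InfoNCE-style contrastive loss between visible and thermal embeddings.
import torch
import torch.nn.functional as F

def contrastive_loss(vis_emb, thermal_emb, temperature: float = 0.07):
    vis = F.normalize(vis_emb, dim=1)
    thr = F.normalize(thermal_emb, dim=1)
    logits = vis @ thr.t() / temperature          # (B, B) cross-modal similarities
    targets = torch.arange(vis.size(0))           # i-th visible matches i-th thermal
    # Symmetric loss: visible -> thermal and thermal -> visible.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = contrastive_loss(torch.randn(32, 128), torch.randn(32, 128))
print(float(loss))
```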
ARTSEG: Employing attention for thermal images semantic segmentation
Farzeen Munir, Shoaib Azam, Unse Fatima, and Moongu Jeon
In Asian Conference on Pattern Recognition, 2021
Research advancements have enabled neural network algorithms deployed in autonomous vehicles to perceive the surroundings. The standard exteroceptive sensors utilized for perception of the environment are cameras and Lidar, and the neural network algorithms developed for these sensors have provided the necessary solutions for the autonomous vehicle’s perception. One major drawback of these exteroceptive sensors is their limited operability in adverse weather conditions, for instance, low illumination and night conditions. The usability and affordability of thermal cameras in the sensor suite of the autonomous vehicle provide the necessary improvement to the vehicle’s perception in adverse weather conditions. Robust perception benefits from the semantics of the environment, which can be obtained by segmenting the different objects in the scene. In this work, we employ a thermal camera for semantic segmentation. We design an attention-based Recurrent Convolutional Neural Network (RCNN) encoder-decoder architecture named ARTSeg for thermal semantic segmentation. The main contribution of this work is the design of the encoder-decoder architecture, which employs units of RCNN for each encoder and decoder block. Furthermore, additive attention is employed in the decoder module to retain high-resolution features and improve feature localization. The efficacy of the proposed method is evaluated on a publicly available dataset, showing better performance than other state-of-the-art methods in mean intersection over union (IoU).
Dynamic Control System Design for Autonomous Car
Shoaib Azam, Farzeen Munir, and Moongu Jeon
In VEHITS, 2020
Autonomous vehicles require high safety standards to maneuver in complex environments. We focus on control of the self-driving vehicle, covering both the longitudinal and lateral dynamics of the vehicle. In this work, we have developed a customized controller for our KIA Soul self-driving car. The customized controller implements PID control for throttle, brake, and steering so that the vehicle follows the desired velocity profile, enabling a comfortable and safe ride. In addition, we have addressed the lateral dynamic model with two approaches: pure pursuit and model predictive control. An extensive analysis comparing pure pursuit with model predictive control is performed to assess the efficacy of the lateral model.
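For reference, the pure pursuit lateral law reduces to a one-line geometric computation; the sketch below is the textbook form with an approximate KIA Soul wheelbase, not the vehicle's tuned controller:

```python
# Minimal pure-pursuit steering computation (textbook form, illustrative values).
import numpy as np

def pure_pursuit_steering(lookahead_point, wheelbase: float, lookahead_dist: float) -> float:
    """Steering angle [rad] that drives the rear axle towards a look-ahead point
    given in the vehicle frame (x forward, y left)."""
    x, y = lookahead_point
    alpha = np.arctan2(y, x)                              # heading error to the target
    return np.arctan2(2.0 * wheelbase * np.sin(alpha), lookahead_dist)

# Example: target 8 m ahead and 1 m to the left, ~2.57 m wheelbase (approx. KIA Soul).
delta = pure_pursuit_steering((8.0, 1.0), wheelbase=2.57, lookahead_dist=8.06)
print(np.degrees(delta), "degrees of steering")
```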
Multiple objects tracking using radar for autonomous driving
Muhammad Ishfaq Hussain, Shoaib Azam, Farzeen Munir, Zafran Khan, and Moongu Jeon
In 2020 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), 2020
Object detection and tracking are integral elements of perceiving the spatio-temporal environment. The availability and affordability of cameras and Lidar have made them the leading sensor modalities for object detection and tracking in research. Deep learning algorithms for object detection and tracking using cameras and Lidar have illustrated promising results, but these sensor modalities are prone to weather conditions and suffer from sparse data and limited spatial resolution. In this work, we explore the problem of detecting and tracking distant objects using radar. To demonstrate the efficacy of the proposed work, extensive experimentation in different traffic scenarios is performed using our self-driving car test-bed.
Visuomotor Steering angle Prediction in Dynamic Perception Environment for Autonomous Vehicle
Farzeen Munir, Shoaib Azam, and Moongu Jeon
In 2020 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia), 2020
Visuomotor coordination in driving assists the driver in performing necessary actions expeditiously. Over recent years, development towards autonomous vehicles has accelerated tremendously, and the increase in computational capability motivates us to train complex deep neural network models using visual cues for a better understanding of visuomotor coordination in autonomous vehicles. In this work, we exploit visual cues in images to predict the steering angle for a self-driving car. Future frames are predicted to estimate a look-ahead steering angle so that the self-driving car can make appropriate decisions. In our dynamic predictive model, we use a deep convolution-LSTM model for predicting the future frames and a series of ResNet blocks for predicting the steering angle. Moreover, we use segmentation as auxiliary information for training the dynamic predictive network. The efficacy of our dynamic predictive model is rigorously tested on our collected dataset and the Udacity dataset.
Data fusion of lidar and thermal camera for autonomous driving
Shoaib Azam, Farzeen Munir, Ahmad Muqeem Sheri, YeongMin Ko, Ishfaq Hussain, and Moongu Jeon
In Applied Industrial Optics: Spectroscopy, Imaging and Metrology, 2019
Adverse environmental conditions create a bottleneck for autonomous driving. This challenge is addressed by fusing data from multiple sensor modalities. Here, thermal and Lidar data are fused for precise perception of the environment.
Automated Taxi Booking Operations for Autonomous Vehicles
Linh Van Ma, Shoaib Azam, Farzeen Munir, Moongu Jeon, and Jinho Choi
In 2019 13th International Conference on Signal Processing and Communication Systems (ICSPCS), 2019
In a conventional taxi booking system, most taxi operations are carried out by decisions made by drivers, which is hard to implement in unmanned vehicles. To address this challenge, we introduce a taxi booking system which assists autonomous vehicles in picking up customers. The system can allocate an autonomous vehicle (AV) as well as plan service trips for a customer request. We use our own AV to serve a customer who makes a taxi request through a mobile application. Apart from the customer and AV components, we build a server to monitor customers and AVs; it also supports inter-communication between a customer and an AV once the AV has decided to pick up the customer.
Where Am I: Localization and 3D Maps for Autonomous Vehicles
Farzeen Munir, Shoaib Azam, Ahmad Muqeem Sheri, YeongMin Ko, and Moongu Jeon
In VEHITS, 2019
The nuts and bolts of autonomous driving find their root in devising the localization strategy. Lidar, one of the newest technologies developed in recent years, provides rich information about the environment in the form of point cloud data, which can be used for localization. In this paper, we discuss a localization approach which generates a 3D map from Lidar point cloud data using Normal Distribution Transform (NDT) mapping. We use our own dataset, collected with our self-driving car, a KIA Soul EV equipped with Lidar and cameras. Once the 3D map has been generated, we use NDT matching to localize the self-driving car.
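The NDT map representation itself can be sketched in a few lines: each voxel is summarized by the mean and covariance of the points falling in it (a conceptual sketch with placeholder data and voxel size, not the mapping pipeline used on the car):

```python
# Conceptual sketch of the NDT map: per-voxel Gaussian statistics of a point cloud.
import numpy as np
from collections import defaultdict

def build_ndt_map(points: np.ndarray, voxel_size: float = 1.0):
    voxels = defaultdict(list)
    for p in points:
        voxels[tuple(np.floor(p / voxel_size).astype(int))].append(p)
    ndt = {}
    for key, pts in voxels.items():
        pts = np.asarray(pts)
        if len(pts) >= 5:                       # need enough points for a stable covariance
            ndt[key] = (pts.mean(axis=0), np.cov(pts.T))
    return ndt

cloud = np.random.uniform(0, 10, size=(5000, 3))   # placeholder for a Lidar scan
ndt_map = build_ndt_map(cloud)
print(len(ndt_map), "voxels with Gaussian statistics")
```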
Object modeling from 3d point cloud data for self-driving vehicles
Shoaib Azam, Farzeen Munir, Aasim Rafique, YeongMin Ko, Ahmad Muqeem Sheri, and Moongu Jeon
In 2018 IEEE Intelligent Vehicles Symposium (IV), 2018
For autonomous vehicles to be deployed and used practically, many problems still need to be solved. One we are interested in is making use of a cheap LIDAR for robust object modelling with 3D point cloud data. Self-driving vehicles require accurate information about the surrounding environment to decide the next course of action. 3D point cloud data obtained from LIDAR gives more accurate distances than counterpart stereo images. However, as LIDAR generates low-resolution data, object detection and modeling are prone to errors. In this work, we propose the use of multiple frames of LIDAR data in an urban environment to construct a comprehensive model of an object. We assume the use of LIDAR on a moving platform, and the results are nearly equal to the 3D CAD model representation of the object.
Autonomous vehicle: The architecture aspect of self driving car
Farzeen Munir, Shoaib Azam, Muhammad Ishfaq Hussain, Ahmed Muqeem Sheri, and Moongu Jeon
In Proceedings of the 2018 International Conference on Sensors, Signal and Image Processing, 2018
Self-driving cars have received a lot of attention in recent years, and many stakeholders such as Google, Uber, and Tesla have invested heavily in this area and developed their own autonomous driving car platforms. The challenge in making an autonomous car is not only stringent performance but also the safety of passengers and pedestrians. Even with the development of technologies, autonomous driving is still an active research area and still requires a lot of experimentation to make the architecture entirely autonomous.
The intriguing area of self-driving cars motivates us to build an autonomous driving platform. In this paper, we discuss the architecture of the self-driving car and its software components, which include localization, detection, motion planning, and mission planning. We also highlight the hardware modules that are responsible for controlling the car. The autonomous driving platform runs state-of-the-art algorithms for localization, detection, and mission and motion planning.
A Benchmark of Computational Models of Saliency to Predict Human Fixations in Videos.
Shoaib Azam, Syed Omer Gilani, Moongu Jeon, Rehan Yousaf, and Jeong-Bae Kim
In VISIGRAPP (4: VISAPP), 2016
In many applications of computer graphics and design, robotics, and computer vision, there is always a need to predict where humans look in a scene. However, it remains a challenging task to determine exactly how the human visual system works. A number of computational models have been designed using different approaches to estimate the human visual system. Most of these models have been tested on images, and performance is calculated on this basis; a benchmark made with images allows immediate comparison between the models. However, there is no such benchmark on videos. To alleviate this problem, we have created a benchmark of six computational models implemented on 12 videos that were viewed by 15 observers in a free-viewing task. Further, a weighted theory (both manual and automatic) is designed and implemented on the videos using these six models, which improved the Area under the ROC. We have found that the Graph-Based Visual Saliency (GBVS) and Random Centre Surround models outperformed the other models.
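The per-frame AUC computation behind such a benchmark can be sketched as scoring how well a saliency map separates fixated from non-fixated pixels (a generic recipe with placeholder data, not the benchmark's code):

```python
# Sketch of the AUC evaluation for a saliency map against a human fixation mask.
import numpy as np
from sklearn.metrics import roc_auc_score

saliency = np.random.rand(90, 120)                 # predicted saliency for one frame
fixations = np.zeros((90, 120), dtype=int)         # ground-truth human fixation mask
fixations[40:50, 55:70] = 1                        # placeholder fixated region

auc = roc_auc_score(fixations.ravel(), saliency.ravel())
print(f"AUC for this frame: {auc:.3f}")            # ~0.5 for the random map above
```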
Vehicle pose detection using region based convolutional neural network
Shoaib Azam, Aasim Rafique, and Moongu Jeon
In 2016 International Conference on Control, Automation and Information Sciences (ICCAIS), 2016
In recent years, category-level object detection has gained a lot of attention. In addition to object localization, estimation of the object pose has practical applications in intelligent transportation, autonomous driving, and robotics. Parts-based models have been used for pose estimation in recent years, but these models depend on manual supervision or require a complex algorithm to locate the object parts. In this work, we use a convolutional neural network for the pose estimation of a vehicle in an image. The advantage of classifying multiple objects at the same time motivates us to choose the convolutional neural network. We make use of a state-of-the-art implementation of the convolutional neural network, the Region-Based Convolutional Neural Network (Faster-RCNN), for estimating the pose of a vehicle. We annotate the Stanford comprehensive cars dataset required for training the model, and upon testing we achieve good results with good accuracy.
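For illustration, vehicle detection with a Faster-RCNN backbone can be run in a few lines via torchvision (COCO-pretrained weights as a stand-in; the paper instead fine-tunes on pose-annotated images from the Stanford cars dataset):

```python
# Sketch of Faster-RCNN inference with torchvision; the input image is a placeholder.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
image = torch.rand(3, 480, 640)                          # placeholder RGB image in [0, 1]
with torch.no_grad():
    detections = model([image])[0]
cars = detections["boxes"][detections["labels"] == 3]    # COCO class 3 = car
print(cars.shape[0], "car boxes detected")
```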
Face de-identification in images using restricted Boltzmann machines
M Aasim Rafique, M Shoaib Azam, Moongu Jeon, and Sangwook Lee
In 2016 11th International Conference for Internet Technology and Secured Transactions (ICITST), 2016
In this work, we discuss the utility of the Restricted Boltzmann Machine (RBM) in the face de-identification challenge. The GRBM is a generative modeling technique, and its unsupervised learning provides the advantage of using raw face data. Faces are de-identified by reconstructing face images from the trained GRBM model. The reconstructed image uses random information from the stochastic units, which makes it hard to re-identify the original face from the de-identified one. Experiments show the proposed technique maintains emotions in the test face, which is intrinsic to the modeling capacity of the RBM.
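A rough sketch of the reconstruction idea using scikit-learn's BernoulliRBM (a binary-unit stand-in for the GRBM used in the paper; the face data below is a random placeholder):

```python
# Sketch: reconstruct faces through the stochastic hidden units of a trained RBM.
import numpy as np
from sklearn.neural_network import BernoulliRBM

faces = np.random.rand(100, 32 * 32)               # placeholder flattened face images in [0, 1]
rbm = BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)
rbm.fit(faces)

# One Gibbs step re-samples the visible units from the stochastic hidden units,
# yielding a reconstruction that keeps global structure but randomises identity detail.
deidentified = rbm.gibbs(faces[:1])
print(deidentified.shape)                          # (1, 1024)
```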
Single object tracking system using fast compressive tracking
Abdullah Tahir, Shoaib Azam, Sujani Sagabala, Moongu Jeon, and Ryu Jeha
In 2016 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia), 2016
In this work we focus on the application aspect of object tracking for a pan-tilt-zoom (PTZ) setup using an ordinary webcam mounted on a custom-made motor assembly, and find that our system is not only robust to illumination conditions but also cost-effective in comparison with PTZ cameras. For object tracking we utilize the Fast Compressive Tracking (FCT) algorithm because of its attractive features, e.g., online learning, fast computation, and robust performance. A PC program interfaced with the embedded system through serial RS232 commands the motors, and hence the camera, to track the desired object in real time such that the tracked object remains in the center of the image.