I am a PhD candidate and software engineer who is passionate about turning challenges into opportunities with inspired data. My research focuses on applying machine learning techniques to improve big data management systems, especially in spatial databases. Here is my CV.
Update: I have graduated with a PhD degree and joined Microsoft as an Applied Scientist.
Spatial Partitioning using Deep Learning: utilize the power of deep learning to build a model that can predict the best partitioning technique for a given spatial dataset. Source Code.
Query Optimization using Deep Learning: explore the capabilities of deep learning in the context of query optimization.
Indexing Techniques for Big Spatial Data: build a big data management system which fully supports spatial data processing. Source Code.
CS 014 - Introduction to Data Structures and Algorithms (Fall 2017, Summer 2018): CS 014 introduces the students to the fundamental data structures and algorithmic analysis techniques such as lists, stacks, queues, search trees, sorting algorithms, hash tables, and graphs.
CS 141 - Intermediate Data Structures and Algorithms (Winter 2018, Spring 2018): CS 141 provides the basic background for a computer scientist in the area of data structures and algorithms. During this course, students learn problem-solving techniques, how to compare them, and how to apply them to real problems.
CS 218 - Design and Analysis of Algorithms (Fall 2018): Study of efficient data structures and algorithms for solving problems from a variety of areas such as sorting, searching, selection, linear algebra, graph theory, and computational geometry. Worst-case and average-case analysis using recurrence relations, generating functions, upper and lower bounds, and other methods.
CS 167 - Introduction to Big Data (Spring 2020): CS 167 covers the data management and systems aspects of big data platforms such as Hadoop, Spark, and AsterixDB. In this course, you will learn how the data is stored in a distributed file system and how the queries run in parallel.
Map & GeoSpatial Group, Microsoft AI & Research: explored how to leverage machine learning, deep learning, and geospatial technology to improve the quality of the Bing Maps geocoding system.
ArcGIS GeoDatabase Group: applied parallel processing techniques to improve the performance and scalability of Utility Network operations; won 2nd Place and the Best Presentation Award at the ESRI Intern Hackathon.
R&D Division: developed a database system for the largest online game service in Vietnam with 5 million customers.
Emotion recognition system: developed a machine learning system to identify human emotions from EEG signals.
Advisor: Dr. Ahmed Eldawy
I also collaborate on projects with Dr. Vassilis Tsotras (UC Riverside), Dr. Vagelis Hristidis (UC Riverside), and Dr. Michael J. Carey (UC Irvine).
Please visit my Google Scholar profile for the most up-to-date list of publications.
Tin Vu, Ahmed Eldawy, Vagelis Hristidis and Vassilis J. Tsotras. "Incremental Partitioning for Efficient Spatial Data Analytics". Proceedings of the VLDB Endowment (PVLDB), Volume 15, Issue 3, 2022. DOI>10.14778/3494124.3494150 -- PDF
Tin Vu, Alberto Belussi, Sara Migliorini, and Ahmed Eldawy. "Towards a Learned Cost Model for Distributed Spatial Join: Data, Code & Models". In Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM), 2022. DOI>10.1145/3511808.3557712 -- PDF
Tin Vu, Alberto Belussi, Sara Migliorini, and Ahmed Eldawy. "A Learned Query Optimizer for Spatial Join", In ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2021, 10 pages, 2021. DOI>10.1145/3474717.3484217 -- PDF
Ahmed Eldawy, Vagelis Hristidis, Saheli Ghosh, Majid Saeedan, Akil Sevim, A.B. Siddique, Samriddhi Singla, Ganesh Sivaram, Tin Vu, and Yaming Zhang. "Beast: Scalable Exploratory Analytics on Spatio-temporal Data", In International Conference on Information and Knowledge Management (CIKM), 12 pages, 2021. DOI>10.1145/3459637.3481897 -- PDF
Tin Vu, Solluna Liu, Renzhong Wang, and Kumarswamy Valegerepura. "Noise Prediction for Geocoding Queries using Word Geospatial Embedding and Bidirectional LSTM", In Proceedings of the 28th International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2020, November, 2020. DOI>10.1145/3397536.3422201 -- PDF
Puloma Katiyar, Tin Vu, Sara Migliorini, Alberto Belussi, and Ahmed Eldawy. "SpiderWeb: A Spatial Data Generator on the Web", In 28th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2020, November, 2020. DOI>10.1145/3397536.3422351 -- PDF
Tin Vu, Alberto Belussi, Sara Migliorini, and Ahmed Eldawy. "Using Deep Learning for Big Spatial Data Partitioning", In ACM Transactions on Spatial Algorithms and Systems (TSAS), 2020. DOI>10.1145/3402126 -- PDF
Tin Vu and Ahmed Eldawy, "DeepSampling: Selectivity Estimation with Predicted Error and Response Time", DeepSpatial2020, 1st ACM SIGKDD Workshop on Deep Learning for Spatiotemporal Data, Applications, and Systems. DOI>10.1145/0000000.0000000 -- PDF
Tin Vu and Ahmed Eldawy. "R*-Grove: Balanced Spatial Partitioning for Large-Scale Datasets", In Frontiers in Big Data, August, 2020. DOI>10.3389/fdata.2020.00028 -- PDF
Saheli Ghosh, Tin Vu, Mehrad Amin Eskandari and Ahmed Eldawy, "UCR-STAR: the UCR spatio-temporal active repository", SIGSPATIAL Special 11, no. 2 (2019): 34-40. DOI>10.1145/3377000.3377005 -- PDF
Tin Vu, Sara Migliorini, Ahmed Eldawy, and Alberto Belussi. "Spatial Data Generators", In 1st ACM SIGSPATIAL International Workshop on Spatial Gems (SpatialGems 2019), 2019. Best Paper Award. DOI>10.1145/0000000.0000000 -- PDF
Tin Vu, "Deep Query Optimization", In Proceedings of the 2019 International Conference on Management of Data (pp. 1856-1858). DOI>10.1145/3299869.3300104 -- PDF
Tin Vu and Ahmed Eldawy. "R-Grove: growing a family of R-trees in the big-data forest", In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, (SIGSPATIAL 2018), November, Seattle, WA, pages 532-535, 2018. DOI>10.1145/3274895.3274984 -- PDF
Thanh Nguyen Trung, Tin Vu, Minh Nguyen, "BFC: High performance distributed big file cloud storage based on key value store", 16th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, Takamatsu, Japan, 06/2015. DOI>10.1109/SNPD.2015.7176209 -- PDF
I love running. I run ~3 miles every day.
I also like books, especially historical books. This is my reading list.
Other links: Ha Tran, the most incredible female vocalist in Vietnam.