Publications
Authors marked with * are the corresponding authors. Underline indicates that the author is or was my student or postdoc. Papers marked with ** use alphabetical ordering of authors, following the convention of theoretical computer science.
Journal Articles
- A Survey of Geometric Graph Neural Networks: Data Structures, Models and Applications. To appear in Frontiers of Computer Science (FCS), 2025 [arXiv]
- Scalable and Effective Graph Neural Networks via Trainable Random Walk Sampling. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2025
- When Transformer Meets Large Graphs: An Expressive and Efficient Two-View Architecture. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
- A Survey on Large Language Model Based Autonomous Agents. Frontiers of Computer Science (FCS), 2024 [arXiv]
- Efficient Algorithms for Personalized PageRank Computation: A Survey. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024 [arXiv]
- Enabling Efficient Random Access to Hierarchically Compressed Text Data on Diverse GPU Platforms. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
- Influence Maximization Revisited: Efficient Sampling with Bound Tightened. ACM Transactions on Database Systems (TODS), 2022
- Building Graphs at Scale via Sequence of Edges: Model and Generation Algorithms. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022 [Code]
- ExactSim: Benchmarking Single-Source SimRank Algorithms with High-Precision Ground Truths. The VLDB Journal, 2021 [Code]
- Efficient Algorithms for Approximate Single-Source Personalized PageRank Queries. ACM Transactions on Database Systems (TODS), 2019
- Parallel Trajectory-to-Location Join. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019
- Distribution-Aware Crowdsourced Entity Collection. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019
- Tight Space Bounds for Two-Dimensional Approximate Range Counting. ACM Transactions on Algorithms (TALG), 2018
- Optimal Algorithms for Selecting Top-k Combinations of Attributes: Theory and Applications. The VLDB Journal, 2017
- Dynamic Shortest Path Monitoring in Spatial Networks. Journal of Computer Science and Technology (JCST), 2016
- Collective Travel Planning in Spatial Networks. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2015
Conference Articles
- **Simple and Optimal Algorithms for Heavy Hitters and Frequency Moments in Distributed Models. Annual ACM Symposium on Theory of Computing (STOC), 2025
- Scalable and Certifiable Graph Unlearning: Overcoming the Approximation Error Barrier. International Conference on Learning Representations (ICLR), 2025. (Spotlight, 5.1%) [arXiv]
- TGB-Seq Benchmark: Challenging Temporal GNNs with Complex Sequential Dynamics. International Conference on Learning Representations (ICLR), 2025
- Advancing Retrosynthesis with Retrieval-Augmented Graph Generation. To appear in AAAI Conference on Artificial Intelligence (AAAI), 2025. (Oral)
- Large-Scale Spectral Graph Neural Networks via Laplacian Sparsification. To appear in ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2025 [arXiv]
- Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level. Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
- S-MolSearch: 3D Semi-supervised Contrastive Learning for Bioactive Molecule Search. Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
- SRAP-Agent: Simulating and Optimizing Scarce Resource Allocation Policy with LLM-Based Agent. Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2024
- Beyond Over-smoothing: Uncovering the Trainability Challenges in Deep Graph Neural Networks. ACM International Conference on Information and Knowledge Management (CIKM), 2024 [arXiv]
- Federated Heterogeneous Contrastive Distillation for Molecular Representation Learning. ACM International Conference on Information and Knowledge Management (CIKM), 2024
- PolyFormer: Scalable Node-wise Filters via Polynomial Graph Transformer. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2024
- EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction. International Conference on Machine Learning (ICML), 2024. (Oral)
- Optimal Matrix Sketching over Sliding Windows. International Conference on Very Large Data Bases (VLDB), 2024. (Best Paper Nomination)
- HierAffinity: Predicting Protein-Ligand Binding Affinity With Hierarchical Modeling. International Conference on Database Systems for Advanced Applications (DASFAA), 2024
- TransPocket: Structural and Geometric Transformer for Ligand Binding Site Detection. International Conference on Database Systems for Advanced Applications (DASFAA), 2024
- Learning-based Property Estimation with Polynomials. ACM Conference on Management of Data (SIGMOD), 2024 [Code]
- **Revisiting Local Computation of PageRank: Simple and Optimal. Annual ACM Symposium on Theory of Computing (STOC), 2024
- Exploring Neural Scaling Law and Data Pruning Methods For Node Classification on Large-scale Graphs. The Web Conference (WWW), 2024. (Oral) [Code]
- Spectral Heterogeneous Graph Convolutions via Positive Noncommutative Polynomials. The Web Conference (WWW), 2024. (Oral)
- PolyGCL: GRAPH CONTRASTIVE LEARNING via Learnable Spectral Polynomial Filters. International Conference on Learning Representations (ICLR), 2024. (Spotlight) [Code]
- **Approximating Single-Source Personalized PageRank with Absolute Error Guarantees. International Conference on Database Theory (ICDT), 2024 [arXiv]
- Do Deep Learning Methods Really Perform Better in Molecular Conformation Generation? International Conference on Learning Representations (ICLR), 2023. (MLDD Oral)
- Estimating Single-Node PageRank in \(\tilde{O}\left(\min\{d_t, \sqrt{m}\}\right)\) Time. International Conference on Very Large Data Bases (VLDB), 2023
- MGNN: Graph Neural Networks Inspired by Distance Geometry Problem. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
- Optimal Dynamic Subset Sampling: Theory and Applications. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
- Clenshaw Graph Neural Networks. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
- Graph Neural Networks with Learnable and Optimal Polynomial Bases. International Conference on Machine Learning (ICML), 2023
- **On Range Summary Queries. International Colloquium on Automata, Languages and Programming (ICALP), 2023 [arXiv]
- Decoupled Graph Neural Networks for Large Dynamic Graphs. International Conference on Very Large Data Bases (VLDB), 2023
- Uni-Mol: A Universal 3D Molecular Representation Learning Framework. International Conference on Learning Representations (ICLR), 2023
- Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme. ACM Conference on Management of Data (SIGMOD), 2023 [arXiv]
- Preformer: Predictive Transformer with Multi-Scale Segment-Wise Correlations for Long-Term Time Series Forecasting. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
- EvenNet: Ignoring Odd-Hop Neighbors Improves Robustness of Graph Neural Networks. Annual Conference on Neural Information Processing Systems (NeurIPS), 2022
- Convolutional Neural Networks On Graphs With Chebyshev Approximation, Revisited. Annual Conference on Neural Information Processing Systems (NeurIPS), 2022. (Oral)
- Approximating Probabilistic Group Steiner Trees in Graphs. International Conference on Very Large Data Bases (VLDB), 2022 [Code]
- MGMAE: Molecular Representation Learning by Reconstructing Heterogeneous Graphs with A High Mask Ratio. ACM International Conference on Information and Knowledge Management (CIKM), 2022
- Optimizing Random Access to Hierarchically-Compressed Data on GPU. International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2022
- Sampling-based Estimation of the Number of Distinct Values in Distributed Environment. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022
- Graph Neural Networks with Node-wise Architecture. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022
- Instant Graph Neural Networks for Dynamic Graphs. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022
- Edge-based Local Push for Personalized PageRank. International Conference on Very Large Data Bases (VLDB), 2022
- Building Graphs at Scale via Sequence of Edges: Model and Generation Algorithms. IEEE International Conference on Data Engineering (ICDE), 2022
- Predicting Protein-Ligand Binding Affinity via Joint Global-Local Interaction Modeling. IEEE International Conference on Data Mining (ICDM), 2022 [arXiv]
- Learning to be a Statistician: Learned Estimator for Number of Distinct Values. International Conference on Very Large Data Bases (VLDB), 2021
- BernNet: Learning Arbitrary Graph Spectral Filters via Bernstein Approximation. Annual Conference on Neural Information Processing Systems (NeurIPS), 2021
- Approximate Graph Propagation. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021
- Graph Neural Networks Inspired by Classical Iterative Algorithms. International Conference on Machine Learning (ICML), 2021. (Long Talk)
- Massively Parallel Algorithms for Personalized PageRank. International Conference on Very Large Data Bases (VLDB), 2021
- Unifying the Global and Local Approaches: An Efficient Power Iteration with Forward Push. ACM Conference on Management of Data (SIGMOD), 2021
- FlashP: An Analytical Pipeline for Real-time Forecasting of Time-Series Relational Data. International Conference on Very Large Data Bases (VLDB), 2021 [arXiv]
- Scalable Graph Neural Networks via Bidirectional Propagation. Annual Conference on Neural Information Processing Systems (NeurIPS), 2020
- SimTab: Accuracy-Guaranteed SimRank Queries Through Tighter Confidence Bounds and Multi-Armed Bandits. International Conference on Very Large Data Bases (VLDB), 2020
- Simple and Deep Graph Convolutional Networks. International Conference on Machine Learning (ICML), 2020. (World Artificial Intelligence Conference Youth Outstanding Paper Nomination Award)
- Personalized PageRank to a Target Node, Revisited. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020
- Influence Maximization Revisited: Efficient Reverse Reachable Set Generation with Bound Tightened. ACM Conference on Management of Data (SIGMOD), 2020
- Exact Single-Source SimRank Computation on Large Graphs. ACM Conference on Management of Data (SIGMOD), 2020
- CrowdGame: A Game-Based Crowdsourcing System for Cost-Effective Data Labeling. ACM Conference on Management of Data (SIGMOD), 2019
- Scalable Graph Embeddings via Sparse Transpose Proximities. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019. (Oral)
- Efficient Estimation of Heat Kernel PageRank for Local Clustering. ACM Conference on Management of Data (SIGMOD), 2019 [arXiv]
- PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs. ACM Conference on Management of Data (SIGMOD), 2019
- Cost-Effective Data Annotation using Game-Based Crowdsourcing. International Conference on Very Large Data Bases (VLDB), 2018
- TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs. ACM Conference on Management of Data (SIGMOD), 2018
- Trajectory Similarity Join in Spatial Networks. International Conference on Very Large Data Bases (VLDB), 2017 [Poster]
- ProbeSim: Scalable Single-Source and Top-k SimRank Computations on Dynamic Graphs. International Conference on Very Large Data Bases (VLDB), 2017
- Collective Travel Planning in Spatial Networks. IEEE International Conference on Data Engineering (ICDE), 2017
- Tracking Matrix Approximation over Distributed Sliding Windows. IEEE International Conference on Data Engineering (ICDE), 2017
- FORA: Simple and Effective Approximate Single-Source Personalized PageRank. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2017
- Towards Maximum Independent Sets on Massive Graphs. International Conference on Very Large Data Bases (VLDB), 2015
- **Equivalence between Priority Queues and Sorting in External Memory. European Symposium on Algorithms (ESA), 2014
- **The Space Complexity of 2-Dimensional Approximate Range Counting. ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013
- **Mergeable Summaries. ACM Symposium on Principles of Database Systems (PODS), 2012. (Test of Time Award)
- **Beyond Simple Aggregates: Indexing for Summary Queries. ACM Symposium on Principles of Database Systems (PODS), 2011 [Slides]
- **Dynamic External Hashing: The Limit of Buffering. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 2009 [arXiv]
Theses
- Classic and New Data Structure Problems in External Memory. PhD Dissertation, Hong Kong University of Science and Technology, defended on February 2, 2012