How to Do Anything with AI | MIT How to AI (Almost) Anything, Spring 2025

https://www.bilibili.com/video/BV1agH8zCE1V/?spm_id_from=333.1387.upload.video_card.click&vd_source=0645a76390602d5640c372c2f44d99e1
Lecturer: https://pliang279.github.io/
https://mit-mi.github.io/how2ai-course/spring2025/schedule/

Research Project

Research Projects on New Modalities

Motivation: Many tasks of real-world impact go beyond image and text.
Challenges:

  • AI for modalities where non-deep-learning methods are effective (e.g., tabular, time-series)
  • Multimodal deep learning + time-series analysis + tabular models
  • AI for physiological sensing, IoT sensing in cities, climate and environment sensing
  • Smell, taste, art, music, tangible and embodied systems

Potential models and datasets to start with

Research Projects on AI Reasoning

Motivation: Robust, reliable, interpretable reasoning in (multimodal) LLMs.
Challenges:

  • Fine-grained and compositional reasoning
  • Neuro-symbolic reasoning
  • Emergent reasoning in foundation models

Potential models and datasets to start with

Research Projects on Interactive Agents

Motivation: Grounding AI models in the web, computer, or other virtual worlds to help humans with digital tasks.
Challenges:

  • Web visual understanding is quite different from natural image understanding
  • Instructions and language grounded in web images, tools, APIs
  • Asking for human clarification, human-in-the-loop
  • Search over environment and planning

Potential models and datasets to start with

Research Projects on Embodied and Tangible AI

Motivation: Building tangible and embodied AI systems that help humans in physical tasks.
Challenges:

  • Perception, reasoning, and interaction
  • Connecting sensing and actuation
  • Efficient models that can run on hardware
  • Understanding influence of actions on the world (world model)

Potential models and datasets to start with

Research Projects on Socially Intelligent AI

Motivation: Building AI that can understand and interact with humans in social situations.
Challenges:

  • Social interaction, reasoning, and commonsense.
  • Building social relationships over months and years.
  • Theory-of-Mind and multi-party social interactions.

Potential models and datasets to start with

Research Projects on Human-AI Interaction

Motivation: What is the right medium for human-AI interaction? How can we really trust AI? How do we enable collaboration and synergy?
Challenges:

  • Modeling and conveying model uncertainty: text input uncertainty, visual uncertainty, multimodal uncertainty, cross-modal interaction uncertainty?
  • Asking for human clarification, human-in-the-loop, types of human feedback, and ways to learn from human feedback through all modalities.
  • New mediums to interact with AI. New tasks beyond imitating humans, leading to collaboration.

Potential models and datasets to start with

Research Projects on Ethics and Safety

Motivation: Large AI models can emit unsafe text content and generate or retrieve biased images.
Challenges:

  • Taxonomizing types of biases: text, vision, audio, generation, etc.
  • Tracing biases to pretraining data, seeing how bias can be amplified during training and fine-tuning.
  • New ways of mitigating biases and aligning to human preferences.

Potential models and datasets to start with

  • Many works on fairness in LLMs -> how to extend to multimodal?
  • Mitigating bias in text generation, image-captioning, image generation

How Do We Get Research Ideas?

1 Bottom-up

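Turn a concrete understanding of existing research's failings into a higher-level experimental question.
In other words:

  • First, develop a deep understanding of the shortcomings, gaps, or failures of existing research
  • Then take that concrete problem
  • And abstract it into a higher-level experimental question
    Put differently: start from "where others fell short" → distill "a more fundamental scientific question"
    This is typical problem-driven research generation
    Bottom-up discovery
    "Moving from details toward theory"
  • Start from concrete experimental results, local problems, and technical bottlenecks
  • Abstract upward step by step
  • Arrive at a new research direction
    Great tool for incremental progress, but may preclude larger leaps
    The approach suits incremental progress but can work against large conceptual leaps:
    you keep "patching the existing system" instead of redefining the problem framing.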

2 Top-down

Move from a higher-level question to a lower-level concrete testing of that question.

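  • First pose a broad, high-level research question
  • Then break it down into concrete, actionable experiments or validation methods
    Theory-driven or theme-driven
    Favors bigger ideas, but can be disconnected from reality
  • Tends to produce big questions
  • May lead to breakthrough research
  • More strategic and systematic

However, it can be detached from reality, lack feasibility, and sometimes stay at the conceptual level.
Bottom-up = problem-solving thinking
Top-down = question-asking thinking
Strong researchers usually need to switch between the two modes.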

posted @ 2026-02-26 14:46  asandstar