
remove hardcoded device="cuda" to support more device #2503

Merged: 10 commits merged into vllm-project:main on Feb 1, 2024

Conversation

jikunshang (Contributor)

Referring to #1948: a lot of the code uses cuda as the device, especially in tensor creation, which makes it hard to add support for other devices. This PR refactors the code to leave an interface so that new devices like cpu or xpu can be added more easily.
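To make the pattern concrete, here is a hedged before/after illustration (a minimal sketch, not a verbatim excerpt from vLLM):

```python
import torch

# Before: the device is hardcoded at every tensor-creation site.
# (Commented out because it requires a CUDA build to run.)
#   mask = torch.zeros(8, dtype=torch.bool, device="cuda")

# After: the device comes from configuration, so cpu or xpu also work.
device = torch.device("cpu")  # e.g., taken from a device config object
mask = torch.zeros(8, dtype=torch.bool, device=device)
```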

@WoosukKwon (Collaborator) left a comment:

@jikunshang Thanks for submitting the PR! While the code looks good overall, I have two concerns:

  1. I believe the device should be automatically detected instead of being declared by users. This also aligns with our current design. While I don't know a good way to implement this, I feel there should be a way to do it as long as the device is supported by PyTorch.
  2. I believe device should not be an attribute of ModelConfig. Can we make a new config class like DeviceConfig? (A sketch of one possible shape follows below.)
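
For illustration, a minimal sketch of what such a DeviceConfig with automatic detection might look like (the detection logic here is an assumption for the sketch, not the code that was eventually merged):

```python
import torch

class DeviceConfig:
    """Sketch: holds the target device, auto-detected by default."""

    def __init__(self, device: str = "auto") -> None:
        if device == "auto":
            # Prefer an accelerator PyTorch can see, else fall back to CPU.
            if torch.cuda.is_available():
                device = "cuda"
            elif hasattr(torch, "xpu") and torch.xpu.is_available():
                device = "xpu"
            else:
                device = "cpu"
        self.device = torch.device(device)

# Usage: DeviceConfig().device resolves to cuda, xpu, or cpu.
```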

@@ -147,18 +148,21 @@ def _prepare_prompt(
         input_tokens = _make_tensor_with_pad(input_tokens,
                                              max_prompt_len,
                                              pad=0,
-                                             dtype=torch.long)
+                                             dtype=torch.long,
+                                             device=self.device)
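
For context, a rough sketch of what a device-aware _make_tensor_with_pad helper could look like (an illustration under assumed semantics, not the vLLM source):

```python
from typing import List

import torch

def _make_tensor_with_pad(x: List[List[int]],
                          max_len: int,
                          pad: int,
                          dtype: torch.dtype,
                          device: torch.device = torch.device("cpu")
                          ) -> torch.Tensor:
    # Right-pad each row to max_len, then build the tensor on the target device.
    padded = [row + [pad] * (max_len - len(row)) for row in x]
    return torch.tensor(padded, dtype=dtype, device=device)
```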
Collaborator:

Can we use the _set_default_torch_device context manager here to not repeat device=self.device?
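
For reference, a minimal sketch of such a context manager (a hypothetical helper; it assumes PyTorch >= 2.0, where torch.device acts as a context manager that routes factory calls to the given device):

```python
from contextlib import contextmanager

import torch

@contextmanager
def _set_default_torch_device(device: torch.device):
    # In PyTorch >= 2.0, entering a torch.device context routes factory
    # functions (torch.empty, torch.zeros, torch.tensor, ...) to `device`.
    with torch.device(device):
        yield

# Hypothetical usage inside _prepare_prompt:
#   with _set_default_torch_device(self.device):
#       input_tokens = _make_tensor_with_pad(input_tokens, max_prompt_len,
#                                            pad=0, dtype=torch.long)
```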

Collaborator:

On second thought, I found that explicitly specifying the device would be better, as we might mix CPU and accelerator tensors in some cases.

@jikunshang force-pushed the main_0118 branch 2 times, most recently from 2df9f74 to 114a846 on January 22, 2024 03:36
@jikunshang requested a review from WoosukKwon on January 23, 2024 01:44
@jikunshang force-pushed the main_0118 branch 3 times, most recently from 37ff8f3 to 88782e0 on January 26, 2024 02:20
@WoosukKwon (Collaborator):

Hi @jikunshang, could you resolve the merge conflicts? The PR looks good overall.

@jikunshang (Contributor, Author):

> Hi @jikunshang, could you resolve the merge conflicts? The PR looks good overall.

Sure. I have resolved the conflicts. All tests should be fixed now.

-        self.device = torch.device(torch.cuda.current_device())
+        self.device_config = (device_config
+                              if device_config is not None else DeviceConfig())
+        self.device = self.device_config.device
Collaborator:

BTW, I intentionally avoided using torch.cuda.set_device, since it can affect user code when using the LLM class.
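
A hedged illustration of that concern (a hypothetical snippet; the commented-out lines assume a machine with more than one CUDA device):

```python
import torch

# If the engine called a process-global setter internally, user code
# running in the same process would be silently affected:
#
#   torch.cuda.set_device(1)           # engine-side, global state
#   x = torch.empty(4, device="cuda")  # user tensor now lands on cuda:1
#
# Passing the device explicitly keeps the engine's choice local to it:
engine_device = torch.device("cpu")  # stand-in for self.device
y = torch.empty(4, device=engine_device)
```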

@WoosukKwon (Collaborator) left a comment:

@jikunshang LGTM! Thanks for submitting the PR and sorry for the delay in my second review.

While I think vLLM still has several torch.cuda calls, I believe this is a good first step towards supporting non-CUDA devices. Thanks for the great work!

@WoosukKwon WoosukKwon merged commit 96b6f47 into vllm-project:main Feb 1, 2024
15 of 17 checks passed
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
alexm-redhat pushed a commit to neuralmagic/nm-vllm that referenced this pull request Feb 13, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 20, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 22, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 4, 2024