character encoding and decoding functions #25387

Closed
r0bnet opened this issue Jun 25, 2020 · 6 comments

@r0bnet
Contributor

r0bnet commented Jun 25, 2020

Current Terraform Version

> tf version
Terraform v0.12.26

Use-cases

For example, I want to create a VM in Azure along with a Storage Account that has a file share. After the VM is provisioned, I want to mount that file share. The problem is that the template file has to be UTF-8 encoded, but PowerShell only accepts UTF-16 encoded strings when using -EncodedCommand.

Attempted Solutions

resource "azurerm_windows_virtual_machine" "myvm" {
  name                  = "myvm"
  resource_group_name   = "myresourcegroup"
  location              = "northeurope"
  size                  = "Standard_B2s"
  network_interface_ids = [...]

...
}

resource "azurerm_storage_account" "storage_account" {
  name                     = "mysuperstorageaccount"
  resource_group_name      = "myresourcegroup"
  location                 = "northeurope"
  account_kind             = "StorageV2"
  account_tier             = "Standard"
  account_replication_type = "LRS"
  access_tier              = "Cool"

  enable_https_traffic_only = true
}

resource "azurerm_storage_share" "files" {
  name                 = "files"
  storage_account_name = azurerm_storage_account.storage_account.name
  quota                = 5
}

locals {
  connect_file_share_script = templatefile("${path.module}/connect-azure-file-share.tpl.ps1", {
    storage_account_file_host = azurerm_storage_account.storage_account.primary_file_host
    storage_account_name      = azurerm_storage_account.storage_account.name
    storage_account_key       = azurerm_storage_account.storage_account.primary_access_key
    file_share_name           = azurerm_storage_share.files.name
    drive_letter              = "Z"
  })
}

resource "azurerm_virtual_machine_extension" "attach_file_share" {
  name                 = "attach_file_share"
  virtual_machine_id   = azurerm_windows_virtual_machine.myvm.id
  publisher            = "Microsoft.Compute"
  type                 = "CustomScriptExtension"
  type_handler_version = "1.10"

  settings = <<SETTINGS
    {
      "commandToExecute": "powershell -EncodedCommand ${base64encode(local.connect_file_share_script)}"
    }
SETTINGS

  depends_on = [azurerm_storage_share.files]
}
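
For illustration only (this is not part of the original report), here is a small Python sketch of why base64encode on its own does not satisfy -EncodedCommand: Terraform's base64encode applies UTF-8 before Base64, while PowerShell interprets the decoded bytes as UTF-16LE. The sample command string is made up for the demonstration.

```python
import base64

# Hypothetical four-character command, chosen so the UTF-8 byte count is even.
command = "dir "

# What a UTF-8 based Base64 encoder produces.
utf8_b64 = base64.b64encode(command.encode("utf-8")).decode("ascii")

# What a UTF-16LE consumer recovers from those same bytes: each pair of
# UTF-8 bytes is misread as one 16-bit code unit, so the text is corrupted.
misread = base64.b64decode(utf8_b64).decode("utf-16-le")

print(utf8_b64)            # ZGlyIA==
print(misread == command)  # False
print(len(misread))        # 2
```

The four input characters collapse into two unrelated characters, which is why the encoding step has to produce UTF-16LE bytes before Base64 is applied.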

connect-azure-file-share.tpl.ps1

$connectTestResult = Test-NetConnection -ComputerName ${storage_account_file_host} -Port 445
if ($connectTestResult.TcpTestSucceeded) {
    cmd.exe /C "cmdkey /add:`"${storage_account_file_host}`" /user:`"Azure\${storage_account_name}`" /pass:`"${storage_account_key}`""
    New-PSDrive -Name ${drive_letter} -PSProvider FileSystem -Root "\\${storage_account_file_host}\${file_share_name}" -Persist
} else {
    Write-Error -Message "Unable to reach the Azure storage account via port 445. Check to make sure your organization or ISP is not blocking port 445, or use Azure P2S VPN, Azure S2S VPN, or Express Route to tunnel SMB traffic over a different port."
}

Proposal

Not sure if that's currently possible as I can imagine that TF currently ONLY supports UTF-8 everywhere.

References

Terraform Error

Call to function "templatefile" failed: contents of
../../modules/mymodule/connect-azure-file-share.tpl.ps1 are not valid UTF-8;
use the filebase64 function to obtain the Base64 encoded contents or the other
file functions (e.g. filemd5, filesha256) to obtain file hashing results
instead.

PowerShell output

-EncodedCommand | -e | -ec
    Accepts a base64-encoded string version of a command. Use this parameter to
    submit commands to PowerShell that require complex quotation marks or curly
    braces. The string must be formatted using UTF-16 character encoding.
    For example:
    $command = 'dir "c:\program files" '
    $bytes = [System.Text.Encoding]::Unicode.GetBytes($command)
    $encodedCommand = [Convert]::ToBase64String($bytes)
    pwsh -encodedcommand $encodedCommand
@apparentlymart
Contributor

Hi @r0bnet,

As you've seen, Terraform strings are sequences of Unicode characters rather than sequences of bytes, so Terraform must always apply some character encoding when moving data to or from external storage, such as files on disk. By convention, Terraform's own functions all use UTF-8, because it is a standard encoding that is relatively easy to produce on all modern platforms and because the Terraform language itself is encoded in UTF-8.

Of course, we do sometimes need to interact with other systems that have different expectations, and so the convention for that is to represent sequences of bytes via base64 encoding, thus allowing the byte sequence to be stored (encoded) in a Terraform language string. That's generally fine as long as Terraform is treating the byte sequence as opaque, but doesn't work for cases where you want Terraform to perform character-based operations (such as template rendering, in your case) because Terraform would then not be able to understand the template sequences within.
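
A rough Python sketch of the distinction described above, assuming a made-up template string: the same characters map to different bytes depending on the encoding, and once those bytes are wrapped in Base64 the template markers are no longer visible to character-based operations.

```python
import base64

# Hypothetical template text containing one interpolation sequence.
text = "Hello ${name}"

utf8_bytes = text.encode("utf-8")
utf16le_bytes = text.encode("utf-16-le")

# Same characters, different bytes: UTF-16LE uses two bytes per character here.
print(len(text), len(utf8_bytes), len(utf16le_bytes))  # 13 13 26

# Base64 lets the byte sequence travel inside an ordinary string, but only
# as opaque data.
opaque = base64.b64encode(utf16le_bytes).decode("ascii")

print("${" in text)    # True
print("${" in opaque)  # False
```

The Base64 alphabet contains neither `$` nor `{`, so a template engine looking at the encoded form has nothing to match.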

Fortunately it looks like this -EncodedCommand option you want to use also expects data in base64 encoding, and so it's consistent with Terraform's conventions. The PowerShell output you showed includes a PowerShell example of encoding the string as UTF16-in-Base64 as a separate step, and I think that'd be the most practical answer for Terraform too, which would imply adding character encoding and decoding functions to Terraform.

With that said, my initial proposal would be to add two new functions to Terraform, named encodetextbase64 and decodetextbase64:

  • encodetextbase64 would take a normal (unencoded) string and an IANA character encoding name and return a base64-encoded representation of the string in the requested encoding. For example, encodetextbase64("dir \"c:\\program files\" ", "UTF-16") would produce a result equivalent to $encodedCommand in the PowerShell example in your issue comment.
  • decodetextbase64 would perform the opposite operation: take a string containing base64 characters along with an IANA character encoding name, decode the base64, and then decode the resulting bytes into a normal (unencoded) Terraform string. decodetextbase64(encodetextbase64("Hello", "UTF-16"), "UTF-16") would therefore produce the same result as just "Hello".
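
The semantics of the two proposed functions can be sketched in Python as follows. This is only an approximation under stated assumptions: the function names are the proposed ones from this comment, the encoding names are resolved by Python's codec registry rather than the IANA index, and nothing here reflects the eventual Terraform implementation. PowerShell's `[System.Text.Encoding]::Unicode` is little-endian UTF-16 without a byte order mark, so the sketch uses UTF-16LE for that case. Whether a bare "UTF-16" would match depends on how an implementation handles byte order and byte order marks.

```python
import base64


def encodetextbase64(text: str, encoding: str) -> str:
    """Encode text in the named character encoding, then Base64-encode the bytes."""
    return base64.b64encode(text.encode(encoding)).decode("ascii")


def decodetextbase64(source: str, encoding: str) -> str:
    """Base64-decode source, then decode the bytes using the named encoding."""
    return base64.b64decode(source, validate=True).decode(encoding)


# Mirrors the command string from the PowerShell help example.
command = r'dir "c:\program files" '
encoded_command = encodetextbase64(command, "UTF-16LE")

print(decodetextbase64(encoded_command, "UTF-16LE") == command)  # True
print(encodetextbase64("Hello", "UTF-16LE"))  # SABlAGwAbABvAA==
print(decodetextbase64(encodetextbase64("Hello", "UTF-16"), "UTF-16"))  # Hello
```

The round trip through "UTF-16" works in Python because its decoder honors the byte order mark its encoder writes.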

An implication of that design is that you would still need to encode the template itself in UTF-8 so that Terraform can interpret it, but then you can pass the result of template rendering to encodetextbase64 to get the UTF16-in-Base64 encoding that PowerShell is expecting.

locals {
  connect_file_share_script = templatefile("${path.module}/connect-azure-file-share.tpl.ps1", {
    storage_account_file_host = azurerm_storage_account.storage_account.primary_file_host
    storage_account_name      = azurerm_storage_account.storage_account.name
    storage_account_key       = azurerm_storage_account.storage_account.primary_access_key
    file_share_name           = azurerm_storage_share.files.name
    drive_letter              = "Z"
  })
}

resource "azurerm_virtual_machine_extension" "attach_file_share" {
  name                 = "attach_file_share"
  virtual_machine_id   = azurerm_windows_virtual_machine.myvm.id
  publisher            = "Microsoft.Compute"
  type                 = "CustomScriptExtension"
  type_handler_version = "1.10"

  settings = jsonencode({
    commandToExecute = "powershell -EncodedCommand ${encodetextbase64(local.connect_file_share_script, "UTF-16")}"
  })
}
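
The order of operations in that configuration (render the UTF-8 template first, re-encode afterwards) can be sketched in Python. Everything here is a stand-in: `string.Template` plays the role of templatefile, the host name is hypothetical, and the template is a single line borrowed from the script in the issue description.

```python
import base64
import json
from string import Template

# Stand-in for the contents of the UTF-8 encoded template file.
template_bytes = (
    r'New-PSDrive -Name ${drive_letter} -PSProvider FileSystem '
    r'-Root "\\${storage_account_file_host}\${file_share_name}" -Persist'
).encode("utf-8")

# Step 1: render while the text is still ordinary characters (read as UTF-8).
rendered = Template(template_bytes.decode("utf-8")).substitute(
    drive_letter="Z",
    storage_account_file_host="example.file.core.windows.net",  # hypothetical
    file_share_name="files",
)

# Step 2: only now convert to the byte encoding the consumer expects.
encoded = base64.b64encode(rendered.encode("utf-16-le")).decode("ascii")

# Step 3: embed the result in the JSON settings document.
settings = json.dumps({"commandToExecute": f"powershell -EncodedCommand {encoded}"})

print(rendered)
print(settings)
```

Keeping the template in UTF-8 means interpolation happens on characters, and the target encoding is applied exactly once, at the boundary where bytes are needed.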

Terraform already has a (currently, indirect) dependency on golang.org/x/text, so we can use the IANA index from golang.org/x/text/encoding/ianaindex to handle the second argument to each of the new functions in a generic way, without having to maintain a separate character encoding index inside the Terraform codebase itself.
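
As a loose analogy only, the same design choice (delegate name resolution to an existing registry instead of maintaining a private table) looks like this in Python. It uses Python's built-in codec registry, not golang.org/x/text/encoding/ianaindex, and `resolve_encoding` is a hypothetical helper name.

```python
import codecs


def resolve_encoding(name: str) -> str:
    """Return the registry's canonical codec name, or raise ValueError."""
    try:
        return codecs.lookup(name).name
    except LookupError as err:
        raise ValueError(f"unsupported character encoding: {name!r}") from err


# Aliases resolve to one canonical codec, so callers can pass any known spelling.
print(resolve_encoding("UTF-16LE") == resolve_encoding("utf_16_le"))  # True

# Unknown names are rejected in one place, with a consistent error.
try:
    resolve_encoding("not-a-real-encoding")
except ValueError as err:
    print(err)
```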

@r0bnet
Contributor Author

r0bnet commented Jun 25, 2020

Hey @apparentlymart,

Thanks for your answer. This is definitely a good approach. I was thinking of doing something like that on the server side (with the extension), but there's always the escaping problem, which is why I originally wanted to use the encoded command.
I'd be happy if something like this were implemented, because I can imagine this feature being useful in other cases too, and I don't think this function is a big deal ;)

@apparentlymart
Contributor

Thanks for confirming that this proposal looks like it would meet your use-case, @r0bnet!

This is not something that the Terraform team at HashiCorp will be able to work on in the near future because our focus is currently elsewhere, but if you or someone else would like to work on a proposed implementation then we'd be happy to review a pull request for it.

@apparentlymart changed the title from "Allow non UTF-8 file content for templatefile" to "character encoding and decoding functions" on Jun 25, 2020
@r0bnet
Contributor Author

r0bnet commented Jul 2, 2020

Yeah, I'll try to create a PR for that because it would help me (and potentially others) a lot. I'll update this issue with the PR number when done.

// Update:

PR #25470

@apparentlymart
Contributor

It looks like we neglected to close this out after merging #25470. Whoops! 🤦‍♂️

Thanks again for contributing the PR. 😀

@github-actions
Contributor

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 29, 2022