Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STM32 UART/DMA Transfer Request Fixes #116

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

tomhepworth
Copy link

@tomhepworth tomhepworth commented Nov 5, 2024

The main purpose of this PR is to address the missing features when trying to use the STM32 UART peripheral with the DMA in the STM32F4. The behaviour should now work more like Figure 316. "Reception using DMA" of the STM32F4 Referance Manual (RM0090).

Hopefully this PR will make it easier to set up and use the Zephyr async uart api on the STM32F4 boards, and enable easier testing of serial protocols with variable length packets.

We implement the IDMA interface in the DMA, and from the UART we request a DMA transfer when value enters data register. If there is a dma peripheral assigned, then when a value appears in the DR, request a transfer.

When a transfer is requested by a peripheral a counter keeps track of how many have been requested. The counter is decremented each time a transfer happens. When the DMA is re-enabled, any transfers that were requested while it was disabled are processed. This is useful for multi-buffering UART, eg with the Zephyr asynchronous UART API.

This includes a fix to the idleLineInterrupt detection in the uart, indended to address renode/renode#690

Below I have included the robot test, zephyr app, and resc file which I have used for testing. The robot test fails on master of renode-infrastructure, but passes on my PR. The zephyr app is an example of a simple serial protocol with variable length packets using the Zephyr Async UART API. The only implemented "opcode" is one to transmit back a packet which was received. Both TX and RX are using the DMA, so if the uart & dma are working, then the packet received should match the one sent.

Changes to board description required to enable the DMA:

Example shown for USART2

usart2: UART.STM32_UART @ sysbus <0x40004400, +0x100>
    IRQ -> nvic@38 | dma1@5
    dmaChannel: 5
    dmaPeripheral: dma1

Robot Test

  *** Settings ***
  Suite Setup     Setup
  Suite Teardown  Teardown
  Test Teardown   Test Teardown
  Resource        ${RENODEKEYWORDS}
  Library		String
  
  *** Variables ***
  ${PLATFORM}             ${CURDIR}/platform.resc
  
  *** Test Cases ***
  Test Uart Loopback Packet 
	  Prepare HW
	  Create Terminal Tester     sysbus.usart2
	  Start Emulation
	  
	  # Note, delibarately chosen a packet where every byte is a valid ascii char. 
	  # This lets RobotFramework convert it to a string for comparison with the 
	  # UART output. Otherwise robot has a hard time converting between 
	  # bytes and strings 
  
	  FOR	${i}     IN RANGE  	   10
	  # Header
	  Send Key To Uart      0x24 
	  Send Key To Uart      0x41 
	  Send Key To Uart      0x22 
  
	  # Data 
	  Send Key To Uart      0x41 
	  Send Key To Uart      0x42 
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41 
	  Send Key To Uart      0x42 
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x42
	  Send Key To Uart      0x41
	  Send Key To Uart      0x5B
	  
	  # Checksum
	  Send Key To Uart      0x5D
  
	  ${result}=	Wait For Line On Uart     $A"ABABABABABABABABABABABABABABABABA[]     includeUnfinishedLine=True 	timeout=3
	  Clear Terminal Tester Report 
	  END
  
  *** Keywords ***
  Prepare HW
          Execute Script  ${PLATFORM}

Zephyr App

  #include <stdio.h>
  #include <string.h>
  
  #include <zephyr/device.h>
  #include <zephyr/drivers/uart.h>
  #include <zephyr/kernel.h>
  
  #define RX_TIMEOUT 0
  #define RX_BUF_SIZE 600
  #define TX_BUF_SIZE 260
  
  #define PACKET_START_BYTE 0x24
  
  #define UART_DEVICE_NODE DT_ALIAS(comms)
  static const struct device *const uart_device = DEVICE_DT_GET(UART_DEVICE_NODE);
  
  struct uart_config uart_config = {.baudrate = 115200,
                                    .parity = UART_CFG_PARITY_NONE,
                                    .stop_bits = UART_CFG_STOP_BITS_1,
                                    .data_bits = UART_CFG_DATA_BITS_8,
                                    .flow_ctrl = UART_CFG_FLOW_CTRL_NONE};
  
  static char tx_buf[TX_BUF_SIZE] = {};
  static char rx_buf[RX_BUF_SIZE];
  static char *rx_buf_a = rx_buf;
  static char *rx_buf_b = rx_buf + RX_BUF_SIZE / 2;
  static char *rx_buffer_list[2] = {0, 0};
  static size_t selected_rx_buffer = 0;
  
  enum COMMAND_OPCODE {
    OPC_TEST_LOOPBACK = 65,
    OPC_UNUSED
  };
  
  typedef struct packet {
    uint8_t special_char;
    enum COMMAND_OPCODE opc;
    uint8_t data_len;
    uint8_t data[255];
    uint8_t checksum;
  } packet_t;
  
  static packet_t active_packet = {0x24, OPC_UNUSED, 0x00, {}, 0x00};
  
  void send_packet(enum COMMAND_OPCODE opc, uint8_t data_len,
                   uint8_t *data) {
  
    uint8_t checksum = 0;
  
    checksum ^= (uint8_t)PACKET_START_BYTE;
    checksum ^= (uint8_t)opc;
    checksum ^= (uint8_t)data_len;
  
    // Header 
    tx_buf[0] = PACKET_START_BYTE;
    tx_buf[1] = (uint8_t)opc;
    tx_buf[2] = (uint8_t)data_len;
  
    // Data
    size_t i = 3;
    for (; i < data_len + 3; i++) {
      checksum ^= (uint8_t)data[i-3];
      tx_buf[i] = (uint8_t)data[i-3];
    }
  
    tx_buf[i] = checksum;
  
    uart_tx(uart_device, tx_buf, data_len + 4, 1000);
    return;
  }
  
  void handle_packet(packet_t *p) {
    switch (p->opc) {
      case OPC_TEST_LOOPBACK:
        send_packet(p->opc, p->data_len, p->data);
        break; 
    default:
      break;
    }
  }
  
  size_t parse_packet(uint8_t *buf, size_t start_offset) {
    printf("starting from ====== %d \n", start_offset);
    size_t new_offset = start_offset;
  
    // Skip bytes until 0x24 seen
    while (buf[new_offset] != PACKET_START_BYTE) {
      new_offset = (new_offset + 1) % RX_BUF_SIZE;
    }
  
    // For each received byte, parse it into the active packet struct.
    // Keep a local copy of each value in order to compute checksum.
  
    // First byte: special char
    uint8_t start_byte = buf[new_offset];
    active_packet.special_char = buf[new_offset];
    new_offset = (new_offset + 1) % RX_BUF_SIZE;
  
    // Second byte: Opcode
    uint8_t opcode = buf[new_offset];
    active_packet.opc = buf[new_offset];
    new_offset = (new_offset + 1) % RX_BUF_SIZE;
  
    // Third byte, Data len
    uint8_t len = buf[new_offset];
    active_packet.data_len = buf[new_offset];
    new_offset = (new_offset + 1) % RX_BUF_SIZE;
  
    // Final byte: Checksum
    uint8_t checksum = buf[(new_offset + len) % RX_BUF_SIZE];
    active_packet.checksum = buf[(new_offset + len) % RX_BUF_SIZE];
  
    // Now read all the data bytes and compute checksum
    // Also copy them into the active_packet data array
    uint8_t checksum_local = start_byte ^ opcode ^ len;
    size_t data_ptr = new_offset;
    size_t packet_data_index = 0;
    while (data_ptr < new_offset + len) {
      checksum_local ^= buf[data_ptr];
      active_packet.data[packet_data_index] = buf[data_ptr];
      data_ptr = (data_ptr + 1) % RX_BUF_SIZE;
      packet_data_index++;
    }
  
    if (checksum == checksum_local) {
      // Wipe packet after it is handled
      size_t clear_index = start_offset;
      while (clear_index != (new_offset + len + 1) % RX_BUF_SIZE) {
        buf[clear_index] = 0x00;
        clear_index = (clear_index + 1) % RX_BUF_SIZE;
      }
  
      handle_packet(&active_packet);
  
      // Return position in buffer of the end of the packet + 1
      return (new_offset + len + 1) % RX_BUF_SIZE;
  
    } else {
      // If the packet was invalid, dont return a new offset.
      // Perhaps the full packet had not arrived yet.
      return start_offset;
    }
  }
  
  void uart_callback(const struct device *dev, struct uart_event *evt,
                     void *user_data) {
  
    ARG_UNUSED(user_data);
  
    switch (evt->type) {
    case UART_TX_DONE:
      break;
    case UART_TX_ABORTED:
      uart_tx_abort(uart_device);
      break;
    case UART_RX_RDY:
      break;
    case UART_RX_BUF_REQUEST:
      selected_rx_buffer = (selected_rx_buffer + 1) % 2;
      int ret = uart_rx_buf_rsp(dev, rx_buffer_list[selected_rx_buffer],
                                RX_BUF_SIZE / 2);
      break;
    case UART_RX_BUF_RELEASED:
      break;
    case UART_RX_DISABLED:
      break;
    case UART_RX_STOPPED:
      break;
    default:
      break;
    };
  }
  
  void thread_uart_entry_point(void *dummy1, void *dummy2, void *dummy3) {
  
    ARG_UNUSED(dummy1);
    ARG_UNUSED(dummy2);
    ARG_UNUSED(dummy3);
  
    if (!device_is_ready(uart_device)) {
      printk("UART device not ready\n");
      return;
    }
  
    // Register rx_buffers
    rx_buffer_list[0] = rx_buf_a;
    rx_buffer_list[1] = rx_buf_b;
    selected_rx_buffer = 0;
  
    int ret = 0;
    ret = uart_callback_set(uart_device, uart_callback, 0);
  
    uart_rx_disable(uart_device);
  
    ret = uart_rx_enable(uart_device, rx_buffer_list[selected_rx_buffer],  RX_BUF_SIZE / 2, RX_TIMEOUT);
  
    size_t rx_parse_offset = 0;
    while (1) {
      rx_parse_offset = parse_packet(rx_buf_a, rx_parse_offset);
      k_sleep(K_MSEC(100));
    }
  
  }
  
  extern const k_tid_t thread_uart_id;
  K_THREAD_DEFINE(thread_uart_id, 1024, thread_uart_entry_point, NULL, NULL, NULL,
                  7, 0, 0);
  
  int main(void) {
    while (1) {
      int ret = 0;
      
      k_sleep(K_MSEC(1000));
    }
  
    return 0;
  }

Kconfig

  CONFIG_SERIAL=y
  CONFIG_UART_ASYNC_API=y
  
  CONFIG_THREAD_NAME=y
  CONFIG_SCHED_CPU_MASK=y

DTC overlay

    aliases {
    comms = &usart2;
    };
    
    ...
    
    usart2: serial@40004400 {
	    compatible = "st,stm32-usart", "st,stm32-uart";
	    reg = < 0x40004400 0x400 >;
	    clocks = < &rcc 0x40 0x20000 >;
	    resets = < &rctl 0x411 >;
	    interrupts = < 0x26 0x0 >;
	    status = "okay";
	    pinctrl-0 = < &usart2_tx_pa2 &usart2_rx_pa3 >;
	    pinctrl-names = "default";
	    current-speed = < 0x1c200 >;
	    dmas = <&dma1 0x6 0x4 0x440 0x0>,
			         <&dma1 0x5 0x4 0x480 0x0>;
	    dma-names = "tx", "rx";
    };

platform.resc

:name: Test platform
:description: stm32 uart/dma Test platform

using sysbus
$name?="STM32 UART/DMA Test"
$bin?=@build/zephyr/zephyr.elf

mach create $name
machine LoadPlatformDescription @stm32f4_discovery-kit.repl
cpu PerformanceInMips 125

connector Connect sysbus.usart2 term
showAnalyzer usart2
showAnalyzer usart1
showAnalyzer uart4
logLevel -1  sysbus.dma1
logLevel -1  sysbus.usart2

sysbus LogPeripheralAccess dma1
sysbus LogPeripheralAccess usart2

emulation SetGlobalSerialExecution true

macro reset
"""
    sysbus LoadELF $bin
    cpu VectorTableOffset 0x08000000
    cpu EnableZephyrMode
    s
"""

runMacro $reset

If the token is evaluated after the timer checks it then the cancellation
token source has changed, and the new value is checked. This caused
every timer to cause an idle line interrupt. Evaluating the token before
starting the timer means that the cancellation status is not overwritten
by new tokens.
This commit attempts to reproduce the behaviour descrbed in Figure
316. "Reception using DMA" of the STM32F4 Referance Manual (RM0090)

We implement the IDMA interface in the DMA, and from the UART we
request a DMA transfer when value enters data register.
If there is a dma peripheral assigned, then when a value appears in the
DR, request a transfer.

When a transfer is requested by a peripheral a counter keeps track of
how many have been requested. The counter is decremented each time a
transfer happens. When the DMA is re-enabled, any transfers that were
requested while it was disabled are processed. This is useful for
multi-buffering UART, eg with the Zephyr asynchronous UART API.
@CLAassistant
Copy link

CLAassistant commented Nov 5, 2024

CLA assistant check
All committers have signed the CLA.

@tomhepworth
Copy link
Author

tomhepworth commented Nov 7, 2024

There is also the problem in the DMA where the SxNDTR register is never decremented by a transfer. This means the Zephyr driver for async uart never triggers the UART_RX_RDY event when the IDLE interrupt happens. I have fixed it locally and will add to this PR

The Reference manual for the STM32F4 Says:

Each DMA transfer consists of three operations:
• A loading from the peripheral data register or a location in memory, addressed through
the DMA_SxPAR or DMA_SxM0AR register
• A storage of the data loaded to the peripheral data register or a location in memory
addressed through the DMA_SxPAR or DMA_SxM0AR register
• A post-decrement of the DMA_SxNDTR register, which contains the number of
transactions that still have to be performed

But this post decrement is not currently in the DMA model.

Copy link
Member

@mateusz-holenko mateusz-holenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tomhepworth, thanks for your contribution!

It looks generally fine, there is just one question.

@@ -399,6 +429,8 @@ private static uint FromTransferType(TransferType transferType)
}
throw new InvalidOperationException("Should not reach here.");
}

public uint transferBacklog;
Copy link
Member

@mateusz-holenko mateusz-holenko Nov 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if having a counter here is correct.

DoPeripheralTransfer uses transfer parameters set by HandleConfigurationWrite. This means that once a channel is enabled, all the queued transfers will use the same configuration (which I believe can be different than the one when the transfer was "scheduled").

Is the counter logic necessary to pass your example?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point which I have not tested.

It is necessary in the case where the DMA runs out of space in a buffer while the uart is still receiving data. In its original state the DMA will miss some bytes. This happens because during the buffer swap procedure in the zephyr driver a UART interrupt register is reset which causes a read from the UART data register, dequeuing a value. Unless the amount of requested transfers is kept track of and processed when the DMA is re-enabled and before the UART is reconfigured, then some data is lost.

I am not 100% sure what the real DMA hardware does in this case, but in the figure I mention in the PR description it shows a request being generated which I assumed would be queued/buffered somehow.

I will investigate

Copy link
Author

@tomhepworth tomhepworth Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mateusz-holenko

Without the counter, wouldn't the dma just do the same thing but also lose the data that my implementation fixes?

If data is being streamed and the DMA is reconfigured halfway through, then the second half will have the new settings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants